Isolated plant gene encoding a β-amyrin synthase

ABSTRACT

Disclosed are isolated β-amyrin synthase-encoding cDNA and genomic nucleic acids, which are involved in saponin synthesis. Also provided are complementary sequences, vectors, transformed host cells, transformed plants, and methods for influencing or affecting triterpene synthesis, and hence resistance to a fungal pathogen of a plant using β-amyrin synthase encoding nucleic acids.

This application is a 35 U.S.C. §371 application which claims priority to PCT/GB00/04908 filed Dec. 20, 2000, the disclosure of which is incorporated herein by reference.

The present invention relates to methods and materials, particularly nucleic acids, for manipulating the levels of terpenoids in plants. It further relates to plants which have been modified using such methods and materials.

PRIOR ART

Triterpenoids are of relevance to a variety of plant characteristics, including palatability to animals, and resistance to pathogens and predators. Triterpenes are mostly stored in plant roots as their glycosides, saponins (see Price et al, 1987 CRC Crit Rev Food Sci Nutr 26, 27–133). Thus, for example, mutants of the diploid oat species, Avena strigosa which lack the major oat root saponin, avenacin A-1 (so called saponin-deficient or ‘sad’ mutants) have been shown to have compromised disease resistance (Papadopoulou et al, 1999 Proc Natl Acad Sci 96, 12923–12928). These mutants have increased susceptibility to a number of different root-infecting fungi, including Gaeumannomyces graminis var. tritici, which is normally non-pathogenic to oats. Genetic analysis suggests that increased disease susceptibility and reduced avenacin content are causally related. Furthermore, a sad mutant which produces reduced avenacin levels (around 15% of that of the wild type) gives only limited disease symptoms when inoculated with G. graminis var. tritici in comparison to other mutants which lack avenacins completely, providing a further link between avenacin content and disease resistance.

β-amyrin is the most common type of triterpene found in higher plants. Two of the Avena strigosa mutants, designated 610 and 109, represent different mutant alleles at the Sad1 locus.

These mutants both lack detectable levels of 2,3-oxidosqualene β-amyrin cyclase (termed hereinafter β-amyrin synthase) which is an oxidosqualene cyclase and is the first committed enzyme in the triterpenoid pathway (see Advances in Lipid Research, 458–461, J. Sanchez, E Cerda-Oledo and E Martinez-Force (eds), 1998). Root preparations of these two mutants also accumulate ¹⁴C-2,3-oxidosqualene and do not produce detectable levels of ¹⁴C-β-amyrin when fed with the radioactive precursor R-[2-¹⁴C] mevalonic acid, again indicating that the step involving the cyclization of 2,3-oxidosqualene to β-amyrin is blocked. It therefore appears that β-amyrin synthase can play a significant role in modifying traits such as disease resistance in plants.

β-amyrin synthase has been cloned from the hairy root of Panax ginseng by Kushiro et al (1998) Eur J Biochem 256: 238–244. These workers used degenerate oligonucleotide primers based on known oxidosqualene synthases. The plant was selected because it is a source of crude drug preparations.

A soybean EST (GenBank Accession AI900929) has also been identified which shares homology with the ginseng sequence.

DISCLOSURE OF THE INVENTION

The present inventors have succeeded in isolating and characterising a cDNA encoding an oxidosqualene cyclase from oat (A. strigosa). This has the characteristics of β-amyrin synthase and has been designated as such herein. An initial attempt to clone the oat gene using primers based on the ginseng sequence was not successful, yielding instead the gene encoding the highly homologous, but functionally distinct, cycloartenol synthase. The cloning of β-amyrin synthase was eventually achieved using a more complicated subtractive cDNA library approach, based on the differential expression of the gene in different parts of the root.

Thus in a first aspect of the present invention there is disclosed a nucleic acid molecule encoding oxidosqualene cyclase from oat which has the characteristics of β-amyrin synthase.

Nucleic acid molecules according to the present invention may be provided isolated and/or purified from their natural environment, in substantially pure or homogeneous form, or free or substantially free of other nucleic acids of the species of origin. Where used herein, the term “isolated” encompasses all of these possibilities.

The nucleic acid molecules may be wholly or partially synthetic. In particular they may be recombinant in that nucleic acid sequences which are not found together in nature (do not run contiguously) have been ligated or otherwise combined artificially. Alternatively they may have been synthesised directly e.g. using an automated synthesiser. They may consist essentially of the gene in question.

Nucleic acid according to the present invention may include cDNA, RNA, genomic DNA and modified nucleic acids or nucleic acid analogs. Where a DNA sequence is specified, e.g. with reference to a figure, unless context requires otherwise the RNA equivalent, with U substituted for T where it occurs, is encompassed. Where a nucleic acid of the invention is referred to herein, the complement of that nucleic acid will also be embraced by the invention. Where genomic nucleic acid sequences of the invention are disclosed, nucleic acids comprising any one or more introns or exons from any of those sequences are also embraced.

Where a nucleic acid (or nucleotide sequence) of the invention is referred to herein, the complement of that nucleic acid (or nucleotide sequence) will also be embraced by the invention. The ‘complement’ in each case is the same length as the reference, but is 100% complementary thereto whereby each nucleotide is capable of base pairing with its counterpart i.e. G to C, and A to T or U.
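The pairing rule just stated (G to C, and A to T or U) and the RNA equivalent (U substituted for T) can be illustrated with a minimal Python sketch. The function names and the six-base example sequence are hypothetical and chosen only for illustration; they are not taken from the Annexes.

```python
def complement(seq, as_rna=False):
    """Return the base-by-base complement; it has the same length as the input.

    Accepts DNA or RNA input. A pairs with T (DNA) or U (RNA); G pairs with C.
    """
    a_partner = "U" if as_rna else "T"
    pairs = {"A": a_partner, "T": "A", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in seq.upper())


def reverse_complement(seq, as_rna=False):
    """Complement read 5' to 3', i.e. the complementary strand as usually written."""
    return complement(seq, as_rna)[::-1]


def rna_equivalent(dna):
    """RNA equivalent of a DNA sequence: U substituted for T where it occurs."""
    return dna.upper().replace("T", "U")


example = "ATGGCC"  # hypothetical sequence, for illustration only
print(complement(example))          # TACCGG
print(reverse_complement(example))  # GGCCAT
print(rna_equivalent(example))      # AUGGCC
```

Whether the complement is written base-for-base or as the reverse complement is purely a matter of notation; both describe the same strand.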

The ‘characteristics of β-amyrin synthase’ means that the encoded polypeptide has β-amyrin synthase activity i.e. the ability to catalyse the conversion of 2,3-oxidosqualene into β-amyrin (see discussion in Kushiro et al, 1998, supra, particularly FIG. 1 therein). β-amyrin synthase function may be assessed as set out in the Examples below e.g. using expression in yeast followed by TLC of biosynthetic products, or by complementation in sad mutants.

Nucleic acids of the first aspect may be advantageously utilised in plants to improve resistance against pathogens. For instance, referring to Papadopoulou et al (1999) supra, enzymes which modify levels of saponins such as β-amyrin appear to play a role in resistance against fungi, typically those ascomycetes which have sterol molecules in their membranes.

Target pathogens for such resistance will not be oomycetes, but may include Gaeumannomyces graminis vars tritici and avenae; Fusarium culmorum; Fusarium avenaceum; Stagonospora nodorum; Stagonospora avenae. Other examples may be those which are set out in “The Plant Pathologist's Pocketbook” 2nd Ed. compiled by Commonwealth Mycological Institute, Kew, Surrey and published by Commonwealth Agricultural Bureaux, England ISBN 0851985173, particularly on pages 102–103 (affecting barley, broad bean, sugar beet, Brassicas); pages 110–113 (affecting lettuce, maize and oat); pages 116–121 (affecting potato, rice, rye, sorghum, soybean, spruce, strawberry, sugarcane, sunflower, tomato, wheat).

Plants overexpressing the enzyme may also be useful sources of oleanane type triterpene saponins for the chemical or pharmaceutical industries e.g. for use in the preparation of antimicrobial phytoprotectants, or drugs. Plants having modified levels of saponins may also have a modified taste and/or nutritional value.

Plants in which it may in principle be desirable to express, or overexpress, nucleic acids of the present invention may include any of those discussed above. Particularly preferred may be barley, bean (Phaseolus), pea, sugar beet, maize, oat, Solanum (e.g. potato), Allium (e.g. garlic, onion and leek), asparagus, tea, peanut, spinach, Cucurbitaceae, yam, rice, rye, sorghum, soyabean, spruce, strawberry, sugarcane, sunflower, tomato and wheat (see also Price et al, supra).

Most preferably the invention is employed using rice, wheat, maize or barley.

Thus in one embodiment of this aspect of the invention, there is disclosed a nucleic acid comprising the OAT β-amyrin synthase nucleotide sequence shown in Annex (I) or a sequence being degeneratively equivalent thereto.

This embodiment embraces any isolated nucleic acid encoding the OAT β-amyrin synthase amino acid sequence shown in Annex (II). Clearly such a nucleic acid may comprise the encoding sequence in Annex (I), or Annex (VIII).

In a further aspect of the present invention there are disclosed nucleic acids which are variants of the sequences of the first aspect.

A variant nucleic acid molecule shares homology with, or is identical to, all or part of the coding sequence discussed above. Generally, variants may encode, or be used to isolate or amplify nucleic acids which encode, polypeptides which are capable of modifying terpenoid (particularly triterpenoid, more particularly saponin) synthesis in a plant, and hence alter the pathogen resistance of that plant, and/or which will specifically bind to an antibody raised against the polypeptide of Annex (II). The triterpenoid synthetic function, particularly β-amyrin synthase function, may be assessed as set out in the Examples below.

Variants of the present invention can be artificial nucleic acids (i.e. containing sequences which have not originated naturally) which can be prepared by the skilled person in the light of the present disclosure. Alternatively they may be novel, naturally occurring, nucleic acids, which may be isolatable using the sequences of the present invention.

Thus a variant may be a distinctive part or fragment (however produced) corresponding to a portion of the sequence provided. The fragments may encode particular functional parts of the polypeptide.

Equally the fragments may have utility in probing for, or amplifying, the sequence provided or closely related ones. Suitable lengths of fragment, and conditions, for such processes are discussed in more detail below.

Also included are nucleic acids which have been extended at the 3′ or 5′ terminus.

Sequence variants which occur naturally may include alleles or other homologues (which may include polymorphisms or mutations at one or more bases).

Artificial variants (derivatives) may be prepared by those skilled in the art, for instance by site directed or random mutagenesis, or by direct synthesis. Preferably the variant nucleic acid is generated either directly or indirectly (e.g. via one or more amplification or replication steps) from an original nucleic acid having all or part of the sequences of the first aspect. Preferably it encodes a β-amyrin synthase.

The term ‘variant’ nucleic acid as used herein encompasses all of these possibilities. When used in the context of polypeptides or proteins it indicates the encoded expression product of the variant nucleic acid.

Some of the aspects of the present invention relating to variants will now be discussed in more detail.

Homology (i.e. similarity or identity) may be as defined using sequence comparisons made using FASTA and FASTP (see Pearson & Lipman, 1988. Methods in Enzymology 183: 63–98). Parameters are preferably set, using the default matrix, as follows:

Gapopen (penalty for the first residue in a gap): −12 for proteins/−16 for DNA

Gapext (penalty for additional residues in a gap): −2 for proteins/−4 for DNA

KTUP word length: 2 for proteins/6 for DNA.

Homology may be at the nucleotide sequence and/or encoded amino acid sequence level. Preferably, the nucleic acid and/or amino acid sequence shares at least about 60%, or 70%, or 80% homology, most preferably at least about 90%, 95%, 96%, 97%, 98% or 99% homology with oat β-amyrin synthase.
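As a rough illustration of how a percentage threshold of this kind might be checked, the following Python sketch computes percent identity over two sequences that have already been aligned. It is not an implementation of FASTA or FASTP and does not use the gap penalties listed above; the gap character, the choice of denominator (all aligned columns) and the example sequences are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity is counted over every aligned column, with a gap
    opposite a residue counting as a mismatch. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the identity is at least the stated threshold (e.g. 60, 90, 95)."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Hypothetical aligned peptide fragments, for illustration only.
print(round(percent_identity("MKT-LV", "MKTALI"), 1))  # 66.7
print(meets_threshold("MKT-LV", "MKTALI", 60))         # True
```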

Thus a variant polypeptide in accordance with the present invention may include within the sequence shown in Annex V, a single amino acid or 2, 3, 4, 5, 6, 7, 8, or 9 changes, about 10, 15, 20, 30, 40 or 50 changes, or greater than about 50, 60, 70, 80, 90, 100, 200, 300 changes. In addition to one or more changes within the amino acid sequence shown, a variant polypeptide may include additional amino acids at the C-terminus and/or N-terminus.

Naturally, regarding nucleic acid variants, changes to the nucleic acid which make no difference to the encoded polypeptide (i.e. ‘degeneratively equivalent’) are included within the scope of the present invention.
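The notion of ‘degeneratively equivalent’ sequences can be sketched in Python as follows: two coding sequences are treated as equivalent when they translate to the same polypeptide under the standard genetic code. The function names and the nine-base example sequences are hypothetical and are not taken from the Annexes.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(coding_seq):
    """Translate an in-frame DNA (or RNA) coding sequence; '*' marks a stop."""
    seq = coding_seq.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def degenerately_equivalent(seq_a, seq_b):
    """True if both sequences encode exactly the same polypeptide."""
    return translate(seq_a) == translate(seq_b)


# Hypothetical examples: the first two both encode Met-Ala-Leu.
print(degenerately_equivalent("ATGGCTCTG", "ATGGCCTTA"))  # True
print(degenerately_equivalent("ATGGCTCTG", "ATGGCTCAG"))  # False
```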

Thus in a further aspect of the invention there is disclosed a method of producing a derivative nucleic acid comprising the step of modifying the coding sequence of a nucleic acid of Annex I.

Changes to a sequence, to produce a derivative, may be by one or more of addition, insertion, deletion or substitution of one or more nucleotides in the nucleic acid, leading to the addition, insertion, deletion or substitution of one or more amino acids in the encoded polypeptide.

Changes may be desirable for a number of reasons, including introducing or removing the following features: restriction endonuclease sequences; codon usage; other sites which are required for post translation modification; cleavage sites in the encoded polypeptide; motifs in the encoded polypeptide (e.g. binding sites). Leader or other targeting sequences (e.g. hydrophobic anchoring regions) may be added or removed from the expressed protein to determine its location following expression. All of these may assist in efficiently cloning and expressing an active polypeptide in recombinant form (as described below).

Other desirable mutations may be made by random or site directed mutagenesis in order to alter the activity (e.g. specificity) or stability of the encoded polypeptide.

Changes may be by way of conservative variation, i.e. substitution of one hydrophobic residue such as isoleucine, valine, leucine or methionine for another, or the substitution of one polar residue for another, such as arginine for lysine, glutamic for aspartic acid, or glutamine for asparagine. As is well known to those skilled in the art, altering the primary structure of a polypeptide by a conservative substitution may not significantly alter the activity of that peptide because the side-chain of the amino acid which is inserted into the sequence may be able to form similar bonds and contacts as the side chain of the amino acid which has been substituted out. This is so even when the substitution is in a region which is critical in determining the peptide's conformation.
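By way of illustration only, the conservative groupings named above can be encoded as a simple lookup. The sketch below covers just the residues listed in the preceding paragraph (one-letter codes); it is not a general substitution matrix, and the function name is hypothetical.

```python
# Groups taken from the examples in the text (one-letter amino acid codes):
# hydrophobic I, V, L, M; and the polar pairs R/K, E/D and Q/N.
CONSERVATIVE_GROUPS = [
    {"I", "V", "L", "M"},
    {"R", "K"},
    {"E", "D"},
    {"Q", "N"},
]


def is_conservative(original, replacement):
    """True if both residues fall within one of the groups listed above.

    An unchanged residue is not a substitution, so it returns False.
    """
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


print(is_conservative("I", "V"))  # True  (hydrophobic for hydrophobic)
print(is_conservative("R", "K"))  # True  (arginine for lysine)
print(is_conservative("L", "D"))  # False (hydrophobic for acidic)
```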

Also included are variants having non-conservative substitutions. As is well known to those skilled in the art, substitutions to regions of a peptide which are not critical in determining its conformation may not greatly affect its activity because they do not greatly alter the peptide's three dimensional structure.

In regions which are critical in determining the peptide's conformation or activity such changes may confer advantageous properties on the polypeptide. Indeed, changes such as those described above may confer slightly advantageous properties on the peptide e.g. altered stability or specificity.

In a further aspect of the present invention there is provided a method of identifying and/or cloning a nucleic acid variant from a plant which method employs a probe or primer of the present invention. Target plants include (but are not limited to) rice, maize, wheat, barley, alfalfa, chickpea, bean and pea.

An oligonucleotide for use in probing or amplification reactions may comprise or consist of about 48, 36 or fewer nucleotides (e.g. 18, 21 or 24). Generally specific primers are upwards of 14 nucleotides in length. For optimum specificity and cost effectiveness, primers of 16–30 nucleotides in length may be preferred. Those skilled in the art are well versed in the design of primers for use in processes such as PCR. If required, probing can be done with entire restriction fragments of the gene disclosed herein which may be hundreds or even 2000 or more nucleotides in length.

Generally speaking, two classes of probe/primer form a part of the present invention.

For instance, one class of primer, typified by those shown in Table 1, is distinctive in the sense that it is based on any region present in the β-amyrin synthase sequence disclosed herein, but not in any region of the non-closely related sequences of the prior art (e.g. the soyabean or ginseng sequences discussed supra). ‘Based on’ in this sense can mean that the primer is found in or degeneratively equivalent to the β-amyrin synthase region, or its complement. Examples are shown in Table 1. Such primers will have utility not only in manipulating the oat β-amyrin synthase sequence, but also in those which are expected to be more closely related to it e.g. other monocot β-amyrin synthase genes, such as those from rice, maize, wheat, barley etc.

A second class of primer, typified by those shown in Table 2, is generic in the sense that it is based on regions of β-amyrin synthase genes (or related genes) which appear to be conserved between e.g. the soyabean/ginseng sequences and the oat sequence. Such primers, devised on the basis of the oat β-amyrin synthase gene, will have utility in manipulating amyrin synthase and related sequences in general. Such primers also include any which are based on the conserved DCTAE motif present in the predicted amino acid sequences of several triterpene biosynthetic enzymes from different species (see Annex VI).
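To illustrate what basing a primer on a conserved peptide motif can involve, the following Python sketch enumerates every DNA sequence that encodes DCTAE under the standard genetic code. It is an assumption-laden illustration rather than the design procedure used for the primers of Table 2; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid):
    """All codons that encode the given amino acid (one-letter code)."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def coding_sequences(peptide):
    """Every DNA sequence that encodes the peptide, one codon choice per residue."""
    choices = [codons_for(aa) for aa in peptide.upper()]
    return ["".join(combo) for combo in product(*choices)]


candidates = coding_sequences("DCTAE")
print(len(candidates))     # 128 possible 15-nucleotide coding sequences
print(len(candidates[0]))  # 15
```

The count reflects the degeneracy at each residue (2 codons for D, 2 for C, 4 for T, 4 for A and 2 for E); a degenerate primer pool would in practice be narrowed by reference to the actual nucleotide sequences being targeted.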

A further class of primers is shown in Table 3, which were used in the Examples hereinafter to amplify regions of the genomic sequence of oat β-amyrin synthase gene. Other primers, shown in the Examples below, and used to identify the genomic sequence, were based on cDNA sequence (AMYstaF5 based on the initiation codon, and AMYendR5 based on the 3′ UTR region).

Some other primers of the invention are designated ASEQ1-4 in the Examples below.

In one embodiment, nucleotide sequence information provided herein may be used in a data-base (e.g. of expressed sequence tags, or sequence tagged sites) search to find homologous sequences, such as those which may become available in due course, and expression products of which can be tested for activity as described below.

In a further embodiment, a variant in accordance with the present invention is also obtainable by means of a method which includes:

(a) providing a preparation of nucleic acid, e.g. from plant cells,

(b) providing a nucleic acid molecule which is a probe as described above,

(c) contacting nucleic acid in said preparation with said nucleic acid molecule under conditions for hybridisation of said nucleic acid molecule to any said gene or homologue in said preparation, and identifying said gene or homologue if present by its hybridisation with said nucleic acid molecule.

Probing may employ the standard Southern blotting technique. For instance DNA may be extracted from cells and digested with different restriction enzymes. Restriction fragments may then be separated by electrophoresis on an agarose gel, before denaturation and transfer to a nitrocellulose filter. Labelled probe may be hybridised to the DNA fragments on the filter and binding determined. DNA for probing may be prepared from RNA preparations from cells.

Test nucleic acid may be provided from a cell as genomic DNA, cDNA or RNA, or a mixture of any of these, preferably as a library in a suitable vector. If genomic DNA is used the probe may be used to identify untranscribed regions of the gene (e.g. promoters etc.), such as is described hereinafter. Probing may optionally be done by means of so-called ‘nucleic acid chips’ (see Marshall & Hodgson (1998) Nature Biotechnology 16: 27–31, for a review).

Preliminary experiments may be performed by hybridising under low stringency conditions. For probing, preferred conditions are those which are stringent enough for there to be a simple pattern with a small number of hybridisations identified as positive which can be investigated further.

For instance, screening may initially be carried out under conditions which comprise a temperature of about 37° C. or less, a formamide concentration of less than about 50%, and a moderate to low salt (e.g. Standard Saline Citrate (‘SSC’)=0.15 M sodium chloride; 0.015 M sodium citrate; pH 7) concentration.

Alternatively, screening may be carried out at a temperature of about 50° C. or less and a high salt concentration (e.g. ‘SSPE’=0.180 M sodium chloride; 9 mM disodium hydrogen phosphate; 9 mM sodium dihydrogen phosphate; 1 mM sodium EDTA; pH 7.4). Preferably the screening is carried out at about 37° C., a formamide concentration of about 20%, and a salt concentration of about 5×SSC, or a temperature of about 50° C. and a salt concentration of about 2×SSPE. These conditions will allow the identification of sequences which have a substantial degree of homology (similarity, identity) with the probe sequence, without requiring the perfect homology for the identification of a stable hybrid.

Suitable conditions include, e.g. for detection of sequences that are about 80–90% identical, hybridization overnight at 42° C. in 0.25M Na₂HPO₄, pH 7.2, 6.5% SDS, 10% dextran sulfate and a final wash at 55° C. in 0.1×SSC, 0.1% SDS. For detection of sequences that are greater than about 90% identical, suitable conditions include hybridization overnight at 65° C. in 0.25M Na₂HPO₄, pH 7.2, 6.5% SDS, 10% dextran sulfate and a final wash at 60° C. in 0.1×SSC, 0.1% SDS.

It is well known in the art to increase stringency of hybridisation gradually until only a few positive clones remain. Suitable conditions would be achieved when a large number of hybridising fragments were obtained while the background hybridisation was low. Using these conditions nucleic acid libraries, e.g. cDNA libraries representative of expressed sequences, may be searched. Those skilled in the art are well able to employ suitable conditions of the desired stringency for selective hybridisation, taking into account factors such as oligonucleotide length and base composition, temperature and so on.

One common formula for calculating the stringency conditions required to achieve hybridization between nucleic acid molecules of a specified sequence homology is (Sambrook et al., 1989):

T_m = 81.5° C. + 16.6 log[Na+] + 0.41(% G+C) − 0.63(% formamide) − 600/(number of bp in duplex)

As an illustration of the above formula, using [Na+]=0.368 M and 50% formamide, with GC content of 42% and an average probe size of 200 bases, the T_m is 57° C. The T_m of a DNA duplex decreases by 1–1.5° C. with every 1% decrease in homology. Thus, targets with greater than about 75% sequence identity would be observed using a hybridization temperature of 42° C. Such a sequence would be considered substantially homologous to the nucleic acid sequence of the present invention.
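The worked figure can be reproduced with a short Python sketch of the formula. The function and parameter names are hypothetical; the sodium concentration is taken to be in mol/l and the logarithm to be base 10.

```python
import math


def melting_temperature(na_molar, gc_percent, formamide_percent, duplex_bp):
    """T_m in degrees C from the formula quoted above (Sambrook et al., 1989)."""
    return (81.5
            + 16.6 * math.log10(na_molar)   # salt term, [Na+] in mol/l
            + 0.41 * gc_percent             # base composition term
            - 0.63 * formamide_percent      # formamide term
            - 600.0 / duplex_bp)            # duplex length term


# Values from the illustration: 0.368 M Na+, 42% GC, 50% formamide, 200 bp.
tm = melting_temperature(0.368, 42, 50, 200)
print(round(tm))  # 57
```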

Binding of a probe to target nucleic acid (e.g. DNA) may be measured using any of a variety of techniques at the disposal of those skilled in the art. For instance, probes may be radioactively, fluorescently or enzymatically labelled. Other methods not employing labelling of probe include amplification using PCR (see below) or RNase cleavage. The identification of successful hybridisation is followed by isolation of the nucleic acid which has hybridised, which may involve one or more steps of PCR or amplification of a vector in a suitable host.

Thus one embodiment of this aspect of the present invention is nucleic acid including or consisting essentially of a sequence of nucleotides complementary to a nucleotide sequence hybridisable with any encoding sequence provided herein. Another way of looking at this would be for nucleic acid according to this aspect to be hybridisable with a nucleotide sequence complementary to any encoding sequence provided herein. Of course, DNA is generally double-stranded and blotting techniques such as Southern hybridisation are often performed following separation of the strands without a distinction being drawn between which of the strands is hybridising. Preferably the hybridisable nucleic acid or its complement encodes a product able to influence a resistance characteristic of a plant, particularly via modification of triterpenoid synthesis.

In a further embodiment, hybridisation of a nucleic acid molecule to a variant may be determined or identified indirectly, e.g. using a nucleic acid amplification reaction, particularly the polymerase chain reaction (PCR). PCR requires the use of two primers to amplify target nucleic acid, so preferably two primers as described above are employed. Using RACE PCR, one ‘random’ primer may be used (see “PCR protocols; A Guide to Methods and Applications”, Eds. Innis et al, Academic Press, New York, (1990)).

Thus a method involving use of PCR in obtaining nucleic acid according to the present invention may be carried out as described above, but using a pair of nucleic acid molecule primers useful in (i.e. suitable for) PCR, at least one of which is a primer of the present invention as described above.

In each case above, if need be, clones or fragments identified in the search can be extended. For instance if it is suspected that they are incomplete, the original DNA source (e.g. a clone library, mRNA preparation etc.) can be revisited to isolate missing portions e.g. using sequences, probes or primers based on that portion which has already been obtained to identify other clones containing overlapping sequence.

The methods described above may also be used to determine the presence of one of the nucleotide sequences of the present invention within the genetic context of an individual plant, optionally a transgenic plant, which may be produced as described in more detail below. This may be useful in plant breeding programmes e.g. to directly select plants containing alleles which are responsible for desirable traits in that plant species, either in parent plants or in progeny (e.g. hybrids, F1, F2 etc.). Thus use of particular novel markers defined in the Examples below, or markers which can be designed by those skilled in the art on the basis of the nucleotide sequence information disclosed herein, forms one part of the present invention.

As used hereinafter, unless the context demands otherwise, the term “β-amyrin synthase nucleic acid” is intended to cover any of the nucleic acids of the invention described above, including functional variants.

In one aspect of the present invention, the β-amyrin synthase nucleic acid described above is in the form of a recombinant and preferably replicable vector.

“Vector” is defined to include, inter alia, any plasmid, cosmid, phage or Agrobacterium binary vector in double or single stranded linear or circular form which may or may not be self transmissible or mobilizable, and which can transform a prokaryotic or eukaryotic host either by integration into the cellular genome or by existing extrachromosomally (e.g. an autonomously replicating plasmid with an origin of replication).

Specifically included are shuttle vectors by which is meant a DNA vehicle capable, naturally or by design, of replication in two different host organisms, which may be selected from actinomycetes and related species, bacteria and eukaryotic cells (e.g. higher plant, mammalian, yeast or fungal cells).

A vector including nucleic acid according to the present invention need not include a promoter or other regulatory sequence, particularly if the vector is to be used to introduce the nucleic acid into cells for recombination into the genome.

Preferably the nucleic acid in the vector is under the control of, and operably linked to, an appropriate promoter or other regulatory elements for transcription in a host cell such as a microbial, e.g. bacterial, or plant cell. The vector may be a bi-functional expression vector which functions in multiple hosts. In the case of genomic DNA, this may contain its own promoter or other regulatory elements and in the case of cDNA this may be under the control of an appropriate promoter or other regulatory elements for expression in the host cell.

By “promoter” is meant a sequence of nucleotides from which transcription may be initiated of DNA operably linked downstream (i.e. in the 3′ direction on the sense strand of double-stranded DNA).

“Operably linked” means joined as part of the same nucleic acid molecule, suitably positioned and oriented for transcription to be initiated from the promoter. DNA operably linked to a promoter is “under transcriptional initiation regulation” of the promoter.

Thus this aspect of the invention provides a gene construct, preferably a replicable vector, comprising a promoter operatively linked to a nucleotide sequence provided by the present invention, such as β-amyrin synthase or a variant thereof.

Generally speaking, those skilled in the art are well able to construct vectors and design protocols for recombinant gene expression. Suitable vectors can be chosen or constructed, containing appropriate regulatory sequences, including promoter sequences, terminator fragments, polyadenylation sequences, enhancer sequences, marker genes and other sequences as appropriate. For further details see, for example, Molecular Cloning: a Laboratory Manual: 2nd edition, Sambrook et al, 1989, Cold Spring Harbor Laboratory Press (or later editions of this work).

Many known techniques and protocols for manipulation of nucleic acid, for example in preparation of nucleic acid constructs, mutagenesis (see above discussion in respect of variants), sequencing, introduction of DNA into cells and gene expression, and analysis of proteins, are described in detail in Current Protocols in Molecular Biology, Second Edition, Ausubel et al. eds., John Wiley & Sons, 1992. The disclosures of Sambrook et al. and Ausubel et al. are incorporated herein by reference.

In one embodiment of this aspect of the present invention, there is provided a gene construct, preferably a replicable vector, comprising an inducible promoter operatively linked to a nucleotide sequence provided by the present invention.

The term “inducible” as applied to a promoter is well understood by those skilled in the art. In essence, expression under the control of an inducible promoter is “switched on” or increased in response to an applied stimulus. The nature of the stimulus varies between promoters. Some inducible promoters cause little or undetectable levels of expression (or no expression) in the absence of the appropriate stimulus. Other inducible promoters cause detectable constitutive expression in the absence of the stimulus. Whatever the level of expression is in the absence of the stimulus, expression from any inducible promoter is increased in the presence of the correct stimulus.

Of particular interest in the present context are nucleic acid constructs which operate as plant vectors. Specific procedures and vectors previously used with wide success upon plants are described by Guerineau and Mullineaux (1993) (Plant transformation and expression vectors. In: Plant Molecular Biology Labfax (Croy RRD ed) Oxford, BIOS Scientific Publishers, pp 121–148).

Suitable promoters which operate in plants include the Cauliflower Mosaic Virus 35S (CaMV 35S). Other examples are disclosed at pg 120 of Lindsey & Jones (1989) “Plant Biotechnology in Agriculture” Pub. OU Press, Milton Keynes, UK. The promoter may be selected to include one or more sequence motifs or elements conferring developmental and/or tissue-specific regulatory control of expression. Inducible plant promoters include the ethanol induced promoter of Caddick et al (1998) Nature Biotechnology 16: 177–180. It may be desirable to use a strong constitutive promoter such as the ubiquitin promoter, particularly in monocots.

If desired, selectable genetic markers may be included in the construct, for example those that confer selectable phenotypes such as resistance to antibiotics or herbicides (e.g. kanamycin, hygromycin, phosphinothricin, chlorsulfuron, methotrexate, gentamycin, spectinomycin, imidazolinones and glyphosate).

The present invention also provides methods comprising introduction of such a construct into a host cell, particularly a plant cell.

In a further aspect of the invention, there is disclosed a host cell containing a heterologous construct according to the present invention, especially a plant or a microbial cell.

The term “heterologous” is used broadly in this aspect to indicate that the gene/sequence of nucleotides in question (a β-amyrin synthase gene) has been introduced into said cells of the plant or an ancestor thereof, using genetic engineering, i.e. by human intervention. A heterologous gene may replace an endogenous equivalent gene, i.e. one which normally performs the same or a similar function, or the inserted sequence may be additional to the endogenous gene or other sequence.

Nucleic acid heterologous to a plant cell may be non-naturally occurring in cells of that type, variety or species. Thus the heterologous nucleic acid may comprise a coding sequence of or derived from a particular type of plant cell or species or variety of plant, placed within the context of a plant cell of a different type or species or variety of plant. A further possibility is for a nucleic acid sequence to be placed within a cell in which it or a homolog is found naturally, but wherein the nucleic acid sequence is linked and/or adjacent to nucleic acid which does not occur naturally within the cell, or cells of that type or species or variety of plant, such as operably linked to one or more regulatory sequences, such as a promoter sequence, for control of expression.

The host cell (e.g. plant cell) is preferably transformed by the construct, which is to say that the construct becomes established within the cell, altering one or more of the cell's characteristics and hence phenotype e.g. with respect to triterpenoid synthesis and/or fungal pathogen resistance.

Nucleic acid can be transformed into plant cells using any suitable technology, such as a disarmed Ti-plasmid vector carried by Agrobacterium exploiting its natural gene transfer ability (EP-A-270355, EP-A-0116718, NAR 12(22) 8711–8721, 1984), particle or microprojectile bombardment (U.S. Pat. No. 5,100,792, EP-A-444882, EP-A-434616), microinjection (WO 92/09696, WO 94/00583, EP 331083, EP 175966, Green et al. (1987) Plant Tissue and Cell Culture, Academic Press), electroporation (EP 290395, WO 8706614 Gelvin Debeyser), other forms of direct DNA uptake (DE 4005152, WO 9012096, U.S. Pat. No. 4,684,611), liposome mediated DNA uptake (e.g. Freeman et al. Plant Cell Physiol. 29: 1353 (1984)), or the vortexing method (e.g. Kindle, PNAS U.S.A. 87: 1228 (1990)). Physical methods for the transformation of plant cells are reviewed in Oard, 1991, Biotech. Adv. 9: 1–11.

Agrobacterium transformation is widely used by those skilled in the art to transform dicotyledonous species. Recently, there has also been substantial progress towards the routine production of stable, fertile transgenic plants in almost all economically relevant monocot plants (see e.g. Hiei et al. (1994) The Plant Journal 6, 271–282). Microprojectile bombardment, electroporation and direct DNA uptake are preferred where Agrobacterium alone is inefficient or ineffective. Alternatively, a combination of different techniques may be employed to enhance the efficiency of the transformation process, e.g. bombardment with Agrobacterium coated microparticles (EP-A-486234) or microprojectile bombardment to induce wounding followed by co-cultivation with Agrobacterium (EP-A-486233).

It will be apparent to the skilled person that the particular choice of a transformation system to introduce nucleic acid into plant cells is not essential to or a limitation of the invention, nor is the choice of technique for plant regeneration.

Thus a further aspect of the present invention provides a method of transforming a plant cell involving introduction of a construct as described above into a plant cell and causing or allowing recombination between the vector and the plant cell genome to introduce a nucleic acid according to the present invention into the genome.

The invention further encompasses a host cell transformed with nucleic acid or a vector according to the present invention (e.g. comprising oat β-amyrin synthase sequence) especially a plant or a microbial cell. In the transgenic plant cell (i.e. transgenic for the nucleic acid in question) the transgene may be on an extra-genomic vector or incorporated, preferably stably, into the genome. There may be more than one heterologous nucleotide sequence per haploid genome.

Generally speaking, following transformation, a plant may be regenerated, e.g. from single cells, callus tissue or leaf discs, as is standard in the art. Almost any plant can be entirely regenerated from cells, tissues and organs of the plant. Available techniques are reviewed in Vasil et al., Cell Culture and Somatic Cell Genetics of Plants, Vol I, II and III, Laboratory Procedures and Their Applications, Academic Press, 1984, and Weissbach and Weissbach, Methods for Plant Molecular Biology, Academic Press, 1989.

The generation of fertile transgenic plants has been achieved in the cereals rice, maize, wheat, oat, and barley (reviewed in Shimamoto, K. (1994) Current Opinion in Biotechnology 5, 158–162; Vasil, et al. (1992) Bio/Technology 10, 667–674; Vain et al., 1995, Biotechnology Advances 13 (4): 653–671; Vasil, 1996, Nature Biotechnology 14 page 702).

Plants which include a plant cell according to the invention are also provided.

In addition to the regenerated plant, the present invention embraces all of the following: a clone of such a plant, selfed or hybrid progeny and descendants (e.g. F1 and F2 descendants) and any part of any of these. The invention also provides parts of such plants e.g. any part which may be used in reproduction or propagation, sexual or asexual, including cuttings, seed and so on, or which may be a commodity per se e.g. grain.

A plant according to the present invention may be one which does not breed true in one or more properties. Plant varieties may be excluded, particularly registrable plant varieties according to Plant Breeders' Rights.

The invention further provides a method of influencing or affecting the nature or degree of the triterpenoid (e.g. β-amyrin) synthesis in a plant, and thereby optionally its resistance to a fungal pathogen, the method including the step of causing or allowing expression of a heterologous nucleic acid sequence as discussed above within the cells of the plant. This aspect preferably involves use of the sequence of Annex I (or the polypeptide of Annex II).

Analogous methods for altering taste, palatability etc. of a plant are also embraced.

The step may be preceded by the earlier step of introduction of the nucleic acid into a cell of the plant or an ancestor thereof.

The foregoing discussion has been generally concerned with uses of the nucleic acids of the present invention for production of functional β-amyrin synthase polypeptides in a plant, thereby increasing its pathogen resistance. However the information disclosed herein may also be used to reduce the activity or levels of such polypeptides in cells in which it is desired to do so, thereby reducing triterpenoid synthesis and the levels of downstream secondary metabolites. Triterpenoids exhibit a wide variety of functions in biological systems and in principle their down-regulation may be desirable to modify these functions. For instance the palatability of a plant may be improved by reducing levels of these compounds.

The sequence information disclosed herein may be used for the down-regulation of expression of genes e.g. using anti-sense technology (see e.g. Bourque, (1995), Plant Science 105, 125–149) or sense regulation [co-suppression] (see e.g. Zhang et al., (1992) The Plant Cell 4, 1575–1588). Further options for down regulation of gene expression include the use of ribozymes, e.g. hammerhead ribozymes, which can catalyse the site-specific cleavage of RNA, such as mRNA (see e.g. Jaeger (1997) “The new world of ribozymes” Curr Opin Struct Biol 7:324–335).

In using anti-sense genes or partial gene sequences to down-regulate gene expression, a nucleotide sequence is placed under the control of a promoter in a “reverse orientation” such that transcription yields RNA which is complementary to normal mRNA transcribed from the “sense” strand of the target gene. See, for example, Rothstein et al, 1987; Smith et al, (1988) Nature 334, 724–726; Zhang et al, (1992) The Plant Cell 4, 1575–1588, English et al., (1996) The Plant Cell 8, 179–188. Antisense technology is also reviewed in Flavell, (1994) PNAS USA 91, 3490–3496.
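By way of illustration only, the relationship between a sense sequence and the RNA obtained on transcription of the same sequence in reverse orientation may be sketched in Python as follows. The function names are hypothetical, and the 15-nucleotide fragment is taken from primer AMYstaF5 (SEQ ID NO: 5) purely as an example; it is not put forward as a preferred antisense target.

```python
# Watson-Crick complements for DNA bases (A/C/G/T only in this sketch).
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def transcribe(dna):
    """Write a DNA strand as RNA of the same sequence (T replaced by U)."""
    return dna.upper().replace("T", "U")


def antisense_rna(sense_dna):
    """RNA produced when the sense sequence is transcribed in reverse orientation."""
    return transcribe(reverse_complement(sense_dna))


def fully_base_paired(rna_a, rna_b):
    """True if two RNAs of equal length can pair antiparallel over their whole length."""
    return len(rna_a) == len(rna_b) and all(
        _RNA_PAIR[a] == b for a, b in zip(rna_a, reversed(rna_b))
    )


# Example fragment only (taken from primer AMYstaF5, starting at the ATG).
sense_fragment = "ATGTGGAGGCTAACA"
sense_mrna = transcribe(sense_fragment)
antisense = antisense_rna(sense_fragment)
print(antisense)
print(fully_base_paired(sense_mrna, antisense))
```

The final check simply confirms that the antisense transcript can base pair with the corresponding stretch of sense mRNA; it says nothing about the efficiency of down-regulation in planta.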

An alternative to anti-sense is to use a copy of all or part of the target gene inserted in sense, that is the same, orientation as the target gene, to achieve reduction in expression of the target gene by co-suppression. See, for example, van der Krol et al., (1990) The Plant Cell 2, 291–299; Napoli et al., (1990) The Plant Cell 2, 279–289 and U.S. Pat. No. 5,231,020. Further refinements of the gene silencing or co-suppression technology may be found in WO 95/34668 (Biosource); Angell & Baulcombe (1997) The EMBO Journal 16,12:3675–3684; and Voinnet & Baulcombe (1997) Nature 389: pg 553.

Nucleic acids and associated methodologies for carrying out down-regulation (e.g. complementary sequences) form one part of the present invention.

The present invention also encompasses the expression product of any of the β-amyrin synthase (particularly functional β-amyrin synthase) nucleic acid sequences disclosed above, plus also methods of making the expression product by expression from encoding nucleic acid therefor under suitable conditions, which may be in suitable host cells.

A preferred polypeptide includes the amino acid sequence shown in Annex II. However a polypeptide according to the present invention may be a variant (allele, fragment, derivative, mutant or homologue etc.) of the polypeptide as shown in Annex II. The allele, variant, fragment, derivative, mutant or homologue may have substantially the β-amyrin synthase function of the amino acid sequence shown in Annex II.

Also encompassed by the present invention are polypeptides which although clearly related to a functional oat β-amyrin synthase polypeptide (e.g. they are immunologically cross reactive with the oat β-amyrin synthase polypeptide, or they have characteristic sequence motifs in common with the β-amyrin synthase polypeptide) no longer have β-amyrin synthase function.

Following expression, the recombinant product may, if required, be isolated from the expression system. Generally however the polypeptides of the present invention will be used in vivo (in particular in planta).

Purified oat β-amyrin synthase or variant protein, produced recombinantly by expression from encoding nucleic acid therefor, may be used to raise antibodies employing techniques which are standard in the art. Methods of producing antibodies include immunising a mammal (e.g. mouse, rat, rabbit, horse, goat, sheep or monkey) with the protein or a fragment thereof. Antibodies may be obtained from immunised animals using any of a variety of techniques known in the art, and might be screened, preferably using binding of antibody to antigen of interest. For instance, Western blotting techniques or immunoprecipitation may be used (Armitage et al, 1992, Nature 357: 80–82). Antibodies may be polyclonal or monoclonal. As an alternative or supplement to immunising a mammal, antibodies with appropriate binding specificity may be obtained from a recombinantly produced library of expressed immunoglobulin variable domains, e.g. using lambda bacteriophage or filamentous bacteriophage which display functional immunoglobulin binding domains on their surfaces; for instance see WO 92/01047.

Antibodies raised to a polypeptide or peptide can be used in the identification and/or isolation of homologous polypeptides, and then the encoding genes. Thus, the present invention provides a method of identifying or isolating a polypeptide with β-amyrin synthase function (in accordance with embodiments disclosed herein), including screening candidate peptides or polypeptides with a polypeptide including the antigen-binding domain of an antibody (for example whole antibody or a fragment thereof) which is able to bind an oat β-amyrin synthase peptide, polypeptide or fragment or variant thereof or preferably has binding specificity for such a peptide or polypeptide, such as having an amino acid sequence identified herein. Specific binding members such as antibodies and polypeptides including antigen binding domains of antibodies that bind and are preferably specific for a β-amyrin synthase peptide or polypeptide or mutant, variant or derivative thereof represent further aspects of the present invention, as do their use and methods which employ them.

Candidate peptides or polypeptides for screening may for instance be the products of an expression library created using nucleic acid derived from a plant of interest, or may be the product of a purification process from a natural source.

In addition to the coding parts of the oat β-amyrin synthase gene, and the expression products, also embraced within the present invention are untranscribed parts of the gene.

Thus a further aspect of the invention is an isolated nucleic acid molecule encoding the promoter of the oat β-amyrin synthase gene. This is shown in the sequence Annexes hereinafter.

Also included are homologous variants, or portions, of the promoter, which have “promoter activity” i.e. the ability to initiate transcription. The level of promoter activity is quantifiable for instance by assessment of the amount of mRNA produced by transcription from the promoter or by assessment of the amount of protein product produced by translation of mRNA produced by transcription from the promoter. The amount of a specific mRNA present in an expression system may be determined for example using specific oligonucleotides which are able to hybridise with the mRNA and which are labelled or may be used in a specific amplification reaction such as the polymerase chain reaction. Use of a reporter gene facilitates determination of promoter activity by reference to protein production. The reporter gene preferably encodes an enzyme which catalyses a reaction which produces a detectable signal, preferably a visually detectable signal, such as a coloured product. Many examples are known, including β-galactosidase and luciferase. The presence and/or amount of gene product resulting from expression from the reporter gene may be determined using a molecule able to bind the product, such as an antibody or fragment thereof. The binding molecule may be labelled directly or indirectly using any standard technique. Those skilled in the art are well aware of a multitude of possible reporter genes and assay techniques which may be used to determine promoter activity. Any suitable reporter/assay may be used and it should be appreciated that no particular choice is essential to or a limitation of the present invention.

Homologous variants can be prepared or identified in similar manner to those described above in relation to the coding sequence. To find minimal elements or motifs responsible for activity or regulation, restriction enzymes or nucleases may be used to digest a nucleic acid molecule, or mutagenesis may be employed, followed by an appropriate assay (for example using a reporter gene such as luciferase) to determine the sequence required. Nucleic acid comprising these elements or motifs forms one part of the present invention.

Certain regions of the promoter which are believed to bind transcription factors (based on computational analysis) are identified in the sequence Annexes below. Such fragments including one or more of these regions, and having promoter activity, may be particularly preferred.

In a further aspect of the invention there is provided a nucleic acid construct, preferably an expression vector, including the oat β-amyrin synthase gene promoter region or fragment, mutant, derivative or other homologue or variant thereof able to promote transcription, operably linked to a heterologous gene, e.g. a coding sequence, which is preferably not the coding sequence with which the promoter is operably linked in nature.

The invention will now be further described with reference to the following non-limiting Examples and Annexes. Other embodiments of the invention will occur to those skilled in the art in the light of these.

Annexes

(I)—OAT β-amyrin synthase cDNA (nucleotide sequence)

(II)—OAT β-amyrin synthase cDNA (amino acid sequence)

(III)—ort1s.pk001.c14 (putative oxidosqualene cyclase EST clone from Avena strigosa)

(IV)—CLUSTAL W (1.8) multiple sequence alignment of 6 clones encoding the 5′ end of an oxidosqualene cyclase from A. strigosa

(V)—CLUSTAL W (1.8) multiple sequence alignment of 6 clones encoding the 3′ end of an oxidosqualene cyclase from A. strigosa

(VI)—Alignment of the DCTAE motif in the predicted amino acid sequences of triterpene biosynthetic enzymes of different species

(VII)—Alignment of the β-amyrin synthase with cycloartenol synthase of A. strigosa

(VIII)—The complete genomic sequence (from the translational start codon (ATG) to the stop codon (TGA)) of the β-amyrin synthase from Avena strigosa. Uppercase letters represent the exons and lowercase letters represent the introns. The splice-junction sites were predicted according to the cDNA sequence and by using the “BCM gene finder” web page at the Baylor College of Medicine and the “Splice Site Prediction by Neural Network” web page at the University of California. The gene consists of 18 exons and 17 introns.
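Because exons are given in uppercase and introns in lowercase, the annotated genomic sequence can be split into its parts mechanically. The following Python sketch assumes only that case convention; the function names and the short test string are hypothetical and are not taken from Annex VIII.

```python
import re


def split_by_case(genomic):
    """Return (exons, introns) from a sequence with uppercase exons and lowercase introns."""
    exons = re.findall(r"[A-Z]+", genomic)
    introns = re.findall(r"[a-z]+", genomic)
    return exons, introns


def spliced_coding_sequence(genomic):
    """Join the uppercase (exon) runs in order, dropping the lowercase introns."""
    return "".join(re.findall(r"[A-Z]+", genomic))


# Hypothetical toy sequence: three exons separated by two introns.
toy_gene = "ATGGCTgtaagtttcagGTTCAAgtatgcagCCTTGA"
exons, introns = split_by_case(toy_gene)
print(len(exons), len(introns))
print(spliced_coding_sequence(toy_gene))
```

For a gene that starts and ends in an exon, the number of exons should exceed the number of introns by one, which agrees with the 18 exons and 17 introns stated above.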

(IX)—The complete sequence of the 1941 bp of the β-amyrin synthase promoter. The translation initiation codon is given in uppercase letters. The transcription start site is underlined and was predicted using the “Promoter Prediction by Neural Network” web page at the University of California. The underlined tatataa sequence represents the putative TATA signal predicted with the HCtata (Hamming-Clustering Method for TATA Signal Prediction in Eukaryotic Genes) interactive program at the “WEBGENE” web page of the Institute of Advanced Biomedical Technologies (ITBA).

(X)—Computational analysis of the promoter region using the MatInspector program at the Genomatix web page in Germany. This revealed several putative transcription factor binding sites. Parameters used for the search were: Plant databank-section, Core similarity 0.75 and Matrix similarity 0.85. Above and below the putative binding site position is given the name, the sense (+) or antisense (−) direction, the consensus sequence and the matrix similarity of the corresponding transcription factor.

Tables

Table 1—Specific primers for amplifying A. strigosa β-amyrin synthase

Table 2—Primers for amplifying β-amyrin synthases

Table 3—Primers for amplifying β-amyrin synthase genomic sequence.

The nucleotide codes are as follows:

M=A OR C

R=A OR G

W=A OR T

S=C OR G

Y=C OR T

K=G OR T

V=A OR C OR G

H=A OR C OR T

D=A OR G OR T

B=C OR G OR T

N=A OR C OR G OR T
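The nucleotide codes above can be used to enumerate the specific sequences represented by a degenerate primer. The Python sketch below is illustrative only: the function names are hypothetical, and the example oligonucleotide is simply a back-translation of the DCTAE motif written for this illustration, not one of the primers of Table 2.

```python
from itertools import product

# Degenerate nucleotide codes as listed above.
NUCLEOTIDE_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for code in primer.upper():
        count *= len(NUCLEOTIDE_CODES[code])
    return count


def expand(primer):
    """List every specific sequence represented by a degenerate primer."""
    choices = (NUCLEOTIDE_CODES[code] for code in primer.upper())
    return ["".join(bases) for bases in product(*choices)]


# Hypothetical example: D-C-T-A-E back-translated as GAY TGY ACN GCN GAR.
example_primer = "GAYTGYACNGCNGAR"
print(degeneracy(example_primer))
print(expand(example_primer)[:3])
```

In practice the degeneracy figure is a quick guide to how dilute each individual sequence will be in a degenerate primer mixture.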

Blast Searches

BLAST Analysis using partial cDNA sequence (clone ort1s.pk001.c14)

Further Blastx searches using β-amyrin synthase sequence.

EXAMPLES

Example 1 Construction of Oat Root cDNA Libraries

Two cDNA libraries were constructed and sequenced using standard techniques. Both cDNA libraries were derived from A. strigosa accession number S75 (from the Institute of Grasslands and Environmental Research, Aberystwyth, Wales, UK). The libraries were as follows:

1. ort1s: A subtractive library derived from RNA from wild type (WT) oat root tip material (terminal 5 mm) subtracted against RNA from the remainder of the root.

2. ort1f: A full length cDNA library derived from RNA from root tips of the WT oat line (unsubtracted).

ort1s: WT (S75) root tip vs. remainder of the root subtractive cDNA library

A. strigosa (WT) seeds were surface sterilized with 5% sodium hypochlorite and washed several times with sterile deionized water. After a cold treatment for 48 h at 4° C., the seeds were placed on plates containing wet filter paper and allowed to germinate at 24° C. in the dark. Root tips (the terminal 5 mm) and the remainder of the root were collected from three-day-old seedlings. Total RNA was extracted from both tissues using the RNeasy Plant Mini Kit from Qiagen. For the mRNA isolation we used the Dynabeads mRNA Purification Kit from Dynal. Synthesis of complementary DNA from both mRNA populations and library construction were performed with the SMART PCR cDNA Synthesis Kit and the PCR-Select cDNA Subtraction Kit from Clontech, according to the manufacturer's instructions. The root-tip-derived cDNA was used as “tester” and the rest-root-derived cDNA as “driver”. The amplified cDNA fragments were purified and tailed (200 μM dATP) with Taq DNA polymerase for 30 min at 70° C. Four identical ligation reactions were performed using 1 μl (60 ng) of the tailed cDNA and 0.5 μl (25 ng) pGEM-T vector (Promega) for 16 h at 4° C. with 3 units of ligase (Promega). The reactions were pooled and the mix was purified using the PCR Purification Kit from Qiagen. 1 μl (2 ng) of the purified ligation mix was used to transform 10 μl of ELECTROMAX DH10B competent cells (Gibco BRL) by electroporation (Calvin NM, et al. J Bacteriol. 170(6), 2796–2801, 1988).

ort1f: WT (S75) root-tip specific λ-ZAP cDNA library.

Harvest of root-tip material, total RNA extraction and mRNA isolation was performed as previously described for the construction of the ort1s library. Complementary DNA was synthesized with the ZAP-cDNA Synthesis Kit from Stratagene and the library was cloned into the ZAP Express Vector according to the manufacturer's instructions. Plasmid clones were generated from each positive phage plaque by in vivo excision, using ExAssist helper phage and E. coli strain SOLR, according to the ZAP-cDNA Gigapack III Gold Cloning Kit manual (Stratagene).

Example 2

Isolation of a Partial cDNA Predicted to Encode β-amyrin Synthase from Avena strigosa Using a cDNA Probe Predicted to Encode an Oxidosqualene Cyclase Identified by DNA Sequence Analysis of the Ort1s cDNA Library

DNA sequence analysis of the ort1s cDNA library identified a partial cDNA sequence (clone ort1s.pk001.c14) with homology to oxidosqualene cyclases (see Annex III).

This clone was used as a probe to screen the oat root full length cDNA library ort1f. Approximately 450,000 phage plaques were plated on 6 NZY plates (5 g/lt NaCl, 2 g/lt MgSO₄.7H₂O, 5 g/lt yeast extract, 10 g/lt casein hydrolysate and 15 g/lt agar, pH 7.5) according to standard procedures (Sambrook J et al. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1989) and plaque lifting was performed on Hybond-N+ membranes (Amersham) following the manufacturer's instructions. The ort1s.pk001.c14 clone was cut with KpnI/PstI and the 300 bp insert was purified with the QIAquick Gel Extraction Kit from Qiagen. 5 μl (100 ng) of the insert was labeled with ³²P by “random priming” using the Oligolabelling Kit from Pharmacia and purified using a Pharmacia Sephadex G-50 NICK column. Hybridization was performed in Church buffer (500 mM phosphate buffer, 7% SDS, 1 mM EDTA and 1% BSA) for 16 h at 65° C. The filters were washed 3 times for 15 min with 40 mM phosphate buffer, 5% SDS and 1 mM EDTA at 65° C., followed by 3 washes for 15 min with 40 mM phosphate buffer, 2% SDS and 1 mM EDTA at 65° C. and three washes for 15 min at 60° C. with 40 mM phosphate buffer, 1% SDS and 1 mM EDTA. Over 300 positive phage clones were identified from the first screening. A second and a third screen were carried out with 12 hybridizing clones (giving different strengths of hybridization signal) following the same hybridization and wash conditions as outlined above. Out of these 12 clones, 8 were confirmed as positives following the second and third rounds of screening. Plasmid clones were generated from each positive phage plaque by in vivo excision, using ExAssist helper phage and E. coli strain SOLR, according to the ZAP-cDNA Gigapack III Gold Cloning Kit manual (Stratagene).

DNA sequence analysis of the corresponding inserts was performed with the ABI PRISM 377XL DNA Sequencer using the BigDye Terminator Kit (Perkin Elmer) and T3 and T7 primers. DNA sequence analysis of the 5′ and 3′ ends of these clones indicated that 6 of the 8 positive phage clones contained the predicted full-length sequence of the putative oxidosqualene cyclase from A. strigosa, and that the clones shared identity at the 5′ (388 bp) and 3′ (328 bp) ends. These clones were named amy7As, amy10As, amy11As, amy15As, amy16As and amy23As (see Annexes IV and V).

In order to obtain the complete sequence of the full length oxidosqualene cyclase cDNA we used 4 forward primers from different regions of the cDNA.

ASEQ1: GCATTGGCCTGGTGATTA, (SEQ ID NO: 1)

ASEQ2: GCCCATGAGGTGGTCACA, (SEQ ID NO: 2)

ASEQ3: GGCGCGTGATTATTGTGC, (SEQ ID NO: 3)

ASEQ4: CATGAACTCGTGCGCCTT. (SEQ ID NO: 4)
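For reference, simple properties of the four sequencing primers can be tabulated with a short Python sketch. The Wallace rule used here (2° C. per A or T, 4° C. per G or C) is a rough rule of thumb for short oligonucleotides and is an assumption of this sketch; the text does not state how, or whether, melting temperatures were estimated. The function names are hypothetical.

```python
# Sequencing primers as listed above (SEQ ID NOs: 1 to 4).
SEQUENCING_PRIMERS = {
    "ASEQ1": "GCATTGGCCTGGTGATTA",
    "ASEQ2": "GCCCATGAGGTGGTCACA",
    "ASEQ3": "GGCGCGTGATTATTGTGC",
    "ASEQ4": "CATGAACTCGTGCGCCTT",
}


def gc_fraction(primer):
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return sum(base in "GC" for base in primer) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature estimate in degrees C by the Wallace rule."""
    primer = primer.upper()
    gc = sum(base in "GC" for base in primer)
    at = sum(base in "AT" for base in primer)
    return 2 * at + 4 * gc


for name, sequence in SEQUENCING_PRIMERS.items():
    print(name, len(sequence), round(gc_fraction(sequence), 2), wallace_tm(sequence))
```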

The cDNA was also digested with different restriction enzymes. The following restriction fragments were ligated individually into the pBluescript SK+ vector (Stratagene): two EcoRI fragments of 1100 bp and 200 bp respectively; one EcoRI/XbaI fragment of 1300 bp; one HindIII/XbaI fragment of 2300 bp and one HindIII/BamHI fragment of 300 bp.

These five subclones were also used to determine the complete sequence of the putative oxidosqualene cyclase cDNA (Annex I) together with part of the untranslated 3′ and 5′ sequences. The translation initiation and termination points are shaded.

Primary structure analysis was carried out with the ProtParam Tool at the “Expert Protein Analysis System” (ExPASy) proteomics server of the Swiss Institute of Bioinformatics (SIB). The deduced amino acid sequence revealed a protein of 757 amino acids, a predicted molecular weight (MW) of 86860.8 dalton and a theoretical pI of 5.99 (Annex II).
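The molecular weight reported by such tools is, in essence, the sum of the average residue masses plus one molecule of water. A minimal Python sketch of that calculation is given below. The residue masses are commonly tabulated average values and are an assumption of this sketch (the ProtParam Tool may use slightly different figures); the function name is hypothetical, and the pI calculation is not reproduced. The DCTAE pentapeptide is used only as a short worked example.

```python
# Commonly tabulated average residue masses in daltons (assumed values).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01524  # one water per polypeptide chain (free N- and C-termini)


def average_molecular_weight(protein):
    """Average molecular weight in daltons of an unmodified polypeptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in protein.upper()) + WATER_MASS


# Worked example on a short peptide.
print(round(average_molecular_weight("DCTAE"), 2))

# Consistency check on the figures quoted above: mean residue mass of the
# 757-residue protein, given its predicted molecular weight of 86860.8 Da.
mean_residue_mass = (86860.8 - WATER_MASS) / 757
print(round(mean_residue_mass, 2))
```

A mean residue mass of roughly 115 Da per residue lies between the lightest and heaviest residue masses in the table, as expected for a mixed-composition protein.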

Computational analysis (ScanProsite Tool, ExPASy) revealed the existence of 4 conserved QW motifs with the consensus structure [K/R][G/A]X2-3[F/Y/W][L/I/V]X3QX2-5GXW (Poralla et al. TIBS 19, 157–158, 1994), which is characteristic for oxidosqualene cyclases.
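The QW consensus quoted above translates directly into a regular expression, which allows a candidate sequence to be scanned without access to the ScanProsite Tool. The Python sketch below is illustrative only: the pattern is a literal transcription of the consensus as written here (X read as any residue), the function name is hypothetical, and the test sequence is artificial rather than taken from Annex II.

```python
import re

# [K/R][G/A]X2-3[F/Y/W][L/I/V]X3QX2-5GXW, with X as any residue.
# The lookahead lets the scan report motifs that overlap one another.
QW_MOTIF = re.compile(r"(?=([KR][GA].{2,3}[FYW][LIV].{3}Q.{2,5}G.W))")


def find_qw_motifs(protein):
    """Return (zero-based start, matched text) for each QW motif found."""
    return [(m.start(), m.group(1)) for m in QW_MOTIF.finditer(protein.upper())]


# Artificial sequence containing two motifs with different spacer lengths.
toy_protein = "MM" + "KGAAFLAAAQAAGAW" + "PPPP" + "RASSSYITTTQSSSSSGSW" + "EE"
for start, motif in find_qw_motifs(toy_protein):
    print(start, motif)
```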

The characteristic consensus sequence DCTAE (underlined) of the putative active site of the enzyme is also present in a similar position to that of other plant triterpene cyclases in the databanks (Abe I and Prestwich GD, Lipids 30, no.3, 1995). The Annexes below give the complete deduced amino acid sequence of the A. strigosa oxidosqualene cyclase cDNA, the alignment of the DCTAE motif in the predicted amino acid sequences of triterpene biosynthetic enzymes of different species, and the alignment of the predicted oxidosqualene cyclase with cycloartenol synthase of A. strigosa.

Example 3 Further Evidence that the cDNA Does Encode β-amyrin Synthase

1) Expression in Yeast

The oxidosqualene cyclase cDNA was demonstrated to encode β-amyrin synthase by expression in yeast (Saccharomyces cerevisiae). Two complete cDNA clones (amy10As and amy15As), containing different lengths of the 5′ untranslated leader sequence, were digested with BamHI/XbaI. The inserts were purified with the QIAquick Gel Extraction Kit from Qiagen and ligated into the BamHI/XbaI site of the pYES2 expression vector (Invitrogen) downstream of the GAL1 promoter. These clones were transformed into the GIL77 (Gollub EG et al. J Biol Chem 252, 2846–2854, 1977) yeast mutant strain by the lithium acetate method (Rose MD et al. Cold Spring Harbor Laboratory Press, New York, N.Y., 1990). The GIL77 mutant strain is an ergosterol auxotroph and lacks lanosterol synthase activity. Therefore it accumulates 2,3-oxidosqualene. Selection of the transformants was done on SC-U medium [1.7 g/lt Yeast Nitrogen Base (Difco), 5 g/lt (NH₄)₂SO₄ (SIGMA), 20 g/lt raffinose (BDH), 0.77 g/lt Uracil Drop Out Supplement (Clontech), 20 μg/ml ergosterol (SIGMA), 13 μg/ml hemin (SIGMA) and 5 mg/ml Tween 80 (SIGMA)] at 30° C. Transformants were grown in 50 ml complete YPD liquid medium (Clontech) supplemented with ergosterol (20 μg/ml), hemin (13 μg/ml) and Tween 80 (5 mg/ml) at 30° C. for 60 h. The cells were centrifuged and resuspended in 50 ml SC-U liquid medium containing 2% galactose (BDH) instead of raffinose. The culture was grown for 24 h at 30° C. with shaking (200 rpm). Cells were collected by centrifugation and refluxed with 3 ml 20% KOH/50% ethanol for 10 min at 95° C. The solution was extracted twice with hexane and the supernatant was freeze-dried in a spin-vacuum. The pellet was resuspended in 120 μl of HPLC-grade methanol (BDH) and 20 μl were analyzed on a normal phase silica gel TLC plate (MERCK) which was developed with 1:1 hexane/ethylacetate (BDH). 
The TLC plate was visualized by anisaldehyde staining (96% acetic acid, 2% sulphuric acid and 2% p-anisaldehyde, SIGMA) after incubating the plate for 5 min at 160° C. (Saponins, K Hostettmann and A Marston (eds.), Cambridge University Press, 1995). In order to confirm the Rf value of the products, 10 μg of β-amyrin and cycloartenol (Apin Chemicals Ltd) standards were also loaded on the TLC. As positive and negative controls we used pYES2 constructs harboring the β-amyrin synthase or cycloartenol synthase cDNAs from Panax ginseng, respectively (Kushiro et al. Eur J Biochem 256, 238–244, 1998). Both putative oxidosqualene cyclase cDNAs (amy10As and amy15As) were successfully expressed in yeast and confirmed to encode A. strigosa β-amyrin synthase.

2) Expression in Sad Mutants

Total RNA was extracted from root tissue of 9 saponin-deficient A. strigosa mutants (numbers 610, 109, 1027, 791, 825, 616, 376, 1139 and 9; Proc Natl Acad Sci 96, 12923–12928, 1999) using the RNeasy Plant Mini Kit from Qiagen. The RNA was analyzed on a 1.2% formaldehyde agarose gel and transferred onto Hybond-N+ membrane (Amersham) with 20× SSC (3 M NaCl, 0.3 M sodium citrate, BDH) by standard procedures (Sambrook J et al. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1989). The amy10As clone was cut with BamHI/XbaI and the insert was purified with the QIAquick Gel Extraction Kit from Qiagen. 4 μl (100 ng) of the insert was labeled with ³²P by “random priming” using the Oligolabelling Kit from Pharmacia and purified using a Pharmacia Sephadex G-50 NICK column. Hybridization was performed in Church buffer (500 mM phosphate buffer, 7% SDS, 1 mM EDTA and 1% BSA) for 16 h at 65° C. The filters were washed 3 times for 15 min with 40 mM phosphate buffer, 5% SDS and 1 mM EDTA at 65° C., followed by 3 washes for 15 min with 40 mM phosphate buffer, 2% SDS and 1 mM EDTA at 65° C. and one wash for 15 min at RT with 40 mM phosphate buffer, 1% SDS and 1 mM EDTA. Mutants 610 and 109 represent different mutant alleles at the Sad1 locus, and both lack detectable levels of β-amyrin synthase. The results of the northern blot analysis showed that the β-amyrin synthase mRNA levels are substantially reduced in sad1 mutants, but not in the other sad mutants (all of which have β-amyrin synthase activity).

Example 4 Isolation of the Genomic Clone of the β-amyrin Synthase from A. strigosa.

In order to clone the full genomic sequence of the β-amyrin synthase from Avena strigosa a primer pair was designed, based on the cDNA sequence.

Primer AMYstaF5 (5′-CCATGTGGAGGCTAACAATAGGTGAGGG-3′; SEQ ID NO: 5), containing the translation initiation codon, from the 5′ end, and primer AMYendR5 (5′-TATTTCTCAAAGAAAATAACGCATGAATGCTC-3′; SEQ ID NO: 6), from the untranslated 3′ end, were used.

Approximately 400 ng of high molecular weight genomic DNA was used as template in a 50 μl reaction with 350 μM dNTPs, 300 nM of each primer and 2.5 units of Expand™ Long Template polymerase (Roche). Reaction conditions were 94° C. for 2 min; 94° C. for 10 sec, 65° C. for 30 sec, 68° C. for 6 min for a total of 9 cycles; followed by 94° C. for 10 sec, 65° C. for 30 sec, 68° C. for 6 min+20 sec/cycle for a total of 19 cycles; and a final cycle of 68° C. for 7 min. A single band of about 7.4 kb was excised from the gel and purified with the Gel Extraction Kit from Qiagen. The purified band was tailed for 30 min at 70° C. in a 50 μl reaction using 200 μM of dATP and 5 units of Taq DNA polymerase (Gibco-BRL). For the ligation reaction 2 μl of the tailed band were mixed with 1 μl of pGEM-T vector (Promega) in a final volume of 10 μl. Ligation was performed with 3 units of T4 DNA ligase (Promega) O/N at 4° C. The ligation mix was purified with the PCR Purification Kit from Qiagen and 1 μl was used to transform 10 μl of DH10B cells (Gibco-BRL) according to the manufacturer's instructions. Two independent clones were subjected to automated sequencing using the ABI PRISM 377XL DNA Sequencer and the BigDye Terminator Kit (Perkin Elmer). The primers shown in Table 3 were used in order to cover the complete genomic sequence at least two times.
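The long-template cycling program described above lengthens the extension step by 20 sec in each cycle of the second phase. The sketch below tabulates those extension times and sums the programmed hold times. It assumes that the 20 sec increment already applies to the first of the 19 cycles and it ignores ramping between temperatures; both are assumptions, since the text does not specify them.

```python
# Hold times in seconds, taken from the cycling program in the text.
INITIAL_DENATURE_S = 2 * 60
DENATURE_S = 10
ANNEAL_S = 30
EXTEND_S = 6 * 60
PHASE1_CYCLES = 9
PHASE2_CYCLES = 19
INCREMENT_S = 20          # added to the extension step per cycle in phase 2
FINAL_EXTEND_S = 7 * 60

def phase2_extensions(cycles: int = PHASE2_CYCLES) -> list:
    """Extension time for each phase-2 cycle (assumes increment from cycle 1)."""
    return [EXTEND_S + INCREMENT_S * k for k in range(1, cycles + 1)]

def total_hold_time_s() -> int:
    """Sum of all programmed hold times, excluding temperature ramps."""
    phase1 = PHASE1_CYCLES * (DENATURE_S + ANNEAL_S + EXTEND_S)
    phase2 = sum(DENATURE_S + ANNEAL_S + ext for ext in phase2_extensions())
    return INITIAL_DENATURE_S + phase1 + phase2 + FINAL_EXTEND_S

extensions = phase2_extensions()
total_s = total_hold_time_s()
print(f"last extension: {extensions[-1]} s, total hold time: {total_s / 60:.0f} min")
```

Under these assumptions the last extension step lasts 740 sec and the hold times add up to 259 min, which gives a rough lower bound on instrument time for the run.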

Example 5 Isolation of the Promoter of β-amyrin Synthase from A. strigosa

In order to isolate the sequence regulating expression of the β-amyrin synthase gene, we used a modified version of the “Extender PCR” procedure (A J H Brown et al.) and the “GenomeWalker” (Clontech) protocol. Initially a forward primer:

ADPR-ApaF : 5′-TGCGAGTAAGGATCCTCACGCAAGGAATTCCGACCAGACAGGCC-3′ (SEQ ID NO: 7) and a reverse primer ADPR-ApaR : 3′-H2N -CTGTCCGG-PO4-5′ were designed in order to generate a synthetic adaptor.
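The two oligonucleotides form a partially double-stranded adaptor: the 8-nt strand pairs with the 3′ end of the 44-nt strand. A minimal sketch of that check follows; the variable names are illustrative assumptions, and the short strand is entered 3′→5′ exactly as it is written in the text.

```python
def complement(seq: str) -> str:
    """Base-by-base complement (no reversal) of a DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in seq.upper())

# Long adaptor strand, 5'->3' (SEQ ID NO: 7).
ADAPTOR_LONG = "TGCGAGTAAGGATCCTCACGCAAGGAATTCCGACCAGACAGGCC"

# Short adaptor strand as written in the text, i.e. 3'->5'.
ADAPTOR_SHORT_3_TO_5 = "CTGTCCGG"

# A strand written 3'->5' pairs position by position with the 5'->3' strand,
# so it should equal the plain complement of the long strand's 3' end.
paired_region = ADAPTOR_LONG[-len(ADAPTOR_SHORT_3_TO_5):]
is_complementary = complement(paired_region) == ADAPTOR_SHORT_3_TO_5

# Length of the single-stranded 5' portion of the long strand.
overhang_nt = len(ADAPTOR_LONG) - len(ADAPTOR_SHORT_3_TO_5)

print(is_complementary, overhang_nt)
```

The short strand is fully complementary to the last 8 nt of the long strand, leaving a 36-nt single-stranded region that carries the adaptor primer sites.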

Approximately 6 nmoles of each primer were mixed with 10 μl of 10X ligation buffer (Roche) in a 100 μl final reaction volume. The primers were annealed in a PTC-200 Thermal Cycler (MJ-Research) using the following conditions: 95° C. for 3 min, then 93.5° C. for 1 min, −1.5° C./cycle, for a total of 48 cycles.
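The annealing program is a stepwise ramp-down. The following sketch lists the temperature of each 1 min step, assuming the first of the 48 cycles runs at 93.5° C. and each subsequent cycle is 1.5° C. cooler; that reading of the program is an assumption, not a statement from the text.

```python
START_TEMP_C = 93.5      # temperature of the first 1 min step
STEP_C = 1.5             # decrease applied per cycle
CYCLES = 48

def ramp_temperatures(start: float = START_TEMP_C,
                      step: float = STEP_C,
                      cycles: int = CYCLES) -> list:
    """Temperature of each 1 min step in the ramp-down annealing program."""
    return [start - step * i for i in range(cycles)]

temps = ramp_temperatures()
print(f"first step {temps[0]} C, last step {temps[-1]} C, {len(temps)} steps")
```

On this reading the ramp ends at 23° C., so the oligonucleotides are cooled from near-denaturing temperature to roughly room temperature over 48 min.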

In order to generate the “libraries” for the PCR amplification, 5–6 μg of high molecular weight genomic DNA from A. strigosa were digested with the DraI, EcoRV, PvuII, ScaI and StuI restriction enzymes (Roche). The reactions were performed in a final volume of 100 μl with 80 units of each enzyme for 16 h at 37° C. After digestion the DNA from each “library” was purified using the PCR Purification Kit (Qiagen) and resuspended in a final volume of 28 μl of Tris-HCl pH8.5. 4 μl (600 ng) of DNA from each library was mixed with 0.8 μl (50 pmol) of the synthetic adaptor in a final volume of 8 μl. Ligation was performed overnight (16 h) at 16° C. with 5 units of T4 DNA ligase (Roche). The adaptor ligated DNA from each library was diluted with 72 μl of sterile dH₂O and stored at −20° C.
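The volumes and masses given above fix the DNA concentration of each diluted library. A small worked calculation, assuming no loss of DNA during ligation and using the 10 ng template amount quoted for the subsequent amplification, is sketched below; the variable names are illustrative.

```python
# Values taken from the text.
DNA_IN_LIGATION_NG = 600.0    # 4 ul of purified digest
LIGATION_VOLUME_UL = 8.0
DILUENT_VOLUME_UL = 72.0
TEMPLATE_PER_PCR_NG = 10.0

# Concentration of the purified digest (600 ng in 4 ul).
digest_conc_ng_per_ul = DNA_IN_LIGATION_NG / 4.0

# Final volume and concentration after adding water to the ligation.
final_volume_ul = LIGATION_VOLUME_UL + DILUENT_VOLUME_UL
library_conc_ng_per_ul = DNA_IN_LIGATION_NG / final_volume_ul

# Volume of diluted library that contains 10 ng of template.
template_volume_ul = TEMPLATE_PER_PCR_NG / library_conc_ng_per_ul

print(f"{library_conc_ng_per_ul} ng/ul; {template_volume_ul:.2f} ul per PCR")
```

With these figures each library is at 7.5 ng/μl, so about 1.3 μl supplies the 10 ng of template per reaction.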

Amplification of target sequence from each “library” was carried out using 10 ng of adaptor-ligated template DNA, 20 pmol of adaptor primer:

PR11F (5′-TGCGAGTAAGGATCCTCACGCAAG-3′; SEQ ID NO: 8) combined with a β-amyrin-specific primer:

AMY8R (5′-TCAGCCACGGACCGCCGCCCTCACCTATT-3′; SEQ ID NO: 9), 200 μM of dNTPs and 1 unit of Expand High Fidelity DNA polymerase (Roche). Reaction conditions were 94° C. for 3 min; 94° C. for 25 s, 72° C. for 3 min for a total of 7 cycles; followed by 94° C. for 25 s, 67° C. for 3 min for a total of 31 cycles; and a final cycle of 67° C. for 7 min. A 1/50 dilution of each PCR reaction was used to perform a second round of amplification using the nested adaptor primer PR22F (5′-CACGCAAGGAATTCCGACCAGACA-3′; SEQ ID NO: 10) and a second β-amyrin-specific primer, AMY9R (5′-TTAGCCTCCACATGGTGCGCACCAACAACG-3′; SEQ ID NO: 11).

Reaction conditions were 94° C. for 3 min; 94° C. for 25 s, 72° C. for 3 min for a total of 5 cycles; followed by 94° C. for 25 s, 67° C. for 3 min for a total of 20 cycles; and a final cycle of 67° C. for 7 min. All PCR products were subsequently size fractionated on a 1% agarose gel. Three single bands of 1.4 kb, 1.0 kb and 0.7 kb were excised from the DraI, EcoRV and StuI “libraries” respectively and purified with the Gel Extraction Kit from Qiagen. The DNA fragments were tailed for 30 min at 70° C. in a 50 μl reaction using 200 μM of dATP and 5 units of Taq DNA polymerase (Gibco-BRL). For the ligation reaction 2 μl of the tailed band were mixed with 1 μl of pGEM-T vector (Promega) in a final volume of 10 μl. Ligation was performed with 3 units of T4 DNA ligase (Promega) for 2 h at RT. The ligation mix was purified with the PCR Purification Kit from Qiagen and 1 μl was used to transform 10 μl of DH10B cells (Gibco-BRL) according to the manufacturer's instructions. Two independent clones from each band were subjected to automated sequencing using the ABI PRISM 377XL DNA Sequencer and the BigDye Terminator Kit (Perkin Elmer). The 76 bp of sequence directly upstream of the β-amyrin-specific AMY9R primer site was identical to that predicted from the 5′ end of the A. strigosa β-amyrin cDNA sequence in all three clones. This sequence was followed by approximately 1.25 kb of novel sequence in the DraI fragment, 0.85 kb in the EcoRV fragment and 0.55 kb in the StuI fragment. All fragments included a TATA box sequence 92 bp upstream of the translation initiation codon (ATG), indicating that this was likely to be the promoter region of the gene.
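The two amplification rounds rely on nesting at both ends: the second adaptor primer lies inside the first on the adaptor, and the second gene-specific primer anneals upstream of the first on the gene. The sketch below checks both relationships using the sequences given in the text and a short excerpt from the 5′ end of the cDNA (SEQ ID NO: 14). The names GSP1 and GSP2 are illustrative labels for the primers of SEQ ID NO: 9 and SEQ ID NO: 11.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))

# Adaptor long strand and the two adaptor primers (SEQ ID NOS: 7, 8, 10).
ADAPTOR_LONG = "TGCGAGTAAGGATCCTCACGCAAGGAATTCCGACCAGACAGGCC"
PR11F = "TGCGAGTAAGGATCCTCACGCAAG"
PR22F = "CACGCAAGGAATTCCGACCAGACA"

# Gene-specific reverse primers for round 1 and round 2 (SEQ ID NOS: 9, 11).
GSP1 = "TCAGCCACGGACCGCCGCCCTCACCTATT"
GSP2 = "TTAGCCTCCACATGGTGCGCACCAACAACG"

# Excerpt from the 5' end of the cDNA (SEQ ID NO: 14), sense strand.
CDNA_5_PRIME_EXCERPT = (
    "ATTGCTTGTTTTCCTCGCATACACTGCCCGTTGTTGGTGCGCACCATGTGGAGGCTAACAA"
    "TAGGTGAGGGCGGCGGTCCGTGGCTGAAGTCG"
)

# Positions of the adaptor primers on the adaptor strand.
outer_adaptor_pos = ADAPTOR_LONG.find(PR11F)
nested_adaptor_pos = ADAPTOR_LONG.find(PR22F)

# Reverse primers appear on the sense strand as their reverse complements.
gsp1_pos = CDNA_5_PRIME_EXCERPT.find(reverse_complement(GSP1))
gsp2_pos = CDNA_5_PRIME_EXCERPT.find(reverse_complement(GSP2))

print(outer_adaptor_pos, nested_adaptor_pos, gsp1_pos, gsp2_pos)
```

The nested adaptor primer starts 16 nt downstream of the outer one, and the second gene-specific primer site lies upstream of the first, so the second-round product is contained within the first-round product.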

In order to isolate more novel sequences of the regulatory region of the β-amyrin synthase gene a second upstream walk was carried out using the same adaptor-ligated “libraries”.

Two new β-amyrin-specific primers, AMYPRO10R (5′-GGACATCGAGGTGGTTGCATTTTAGTGGATC-3′; SEQ ID NO: 12) and AMYPRO11R (5′-GGTGCTCTTGCTTGTTCTATGGCCTCGTCTTT-3′; SEQ ID NO: 13), were designed based on the isolated 1.25 kb promoter sequence. Primer PR11F in combination with primer AMYPRO10R was used for the first PCR amplification, and primer PR22F in combination with primer AMYPRO11R for the second.

Reaction conditions were identical to those described earlier for the first upstream walk. This generated a single band product of approximately 950 bp in the EcoRV “library”. The band was excised from the gel and purified with the Gel Extraction Kit from Qiagen. After the standard tailing reaction (described earlier) the fragment was ligated into the pGEM-T vector (Promega). The ligation mix was purified with the PCR Purification Kit from Qiagen and 1 μl was used to transform 10 μl of DH10B cells (Gibco-BRL) according to the manufacturer's instructions. Two independent clones were subjected to automated sequencing using the ABI PRISM 377XL DNA Sequencer and the BigDye Terminator Kit (Perkin Elmer). The 200 bp of sequence directly upstream of the β-amyrin-specific AMY11R primer site was identical to that from the 1.25 kb DraI fragment isolated in the first walk, followed by 720 bp of novel sequence. This extended the 5′ end of the gene by an additional 0.7 kb to give a total of approximately 1.9 kb of sequence upstream of the transcriptional start site.

Computational analysis of the promoter region using the MatInspector program at the Genomatix web page in Germany revealed several putative transcription factor binding sites (see Annex). Parameters used for the search were: Plant databank-section, Core similarity 0.75 and Matrix similarity 0.85. The resulting statistics were:

In 0 seq. 0 matches to P$AG_01 (Agamous)
In 1 seq. 5 matches to P$ATHB1_01 (Arabidopsis thaliana homeo box protein 1)
In 1 seq. 19 matches to P$DOF1_01 (Dof1/MNB1a-single zinc finger transcription factor)
In 1 seq. 8 matches to P$GAMYB_01 (GA-regulated myb gene from barley)
In 1 seq. 2 matches to P$GBP_Q6 (G-box binding proteins)
In 1 seq. 1 matches to P$MYBPH3_01 (Myb-like protein of Petunia hybrida)
In 0 seq. 0 matches to P$O2_01 (Opaque-2)
In 1 seq. 19 matches to P$PBF_01 (PBF (MPBF))
In 1 seq. 4 matches to P$P_01 (maize activator P of flavonoid biosynthetic genes)
In 1 seq. 5 matches to P$SBF1_01 (SBF-1)
In 0 seq. 0 matches to P$bZIP910_01 (bZIP transcription factor from Antirrhinum majus)
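The binding-site consensus strings reported with these matches (for example ACCWACCNN for P$P_01 and YAACSGMC for P$GAMYB_01 in Annex X) are written in IUPAC nucleotide codes. The sketch below shows a simplified exact-consensus scan. It is not the MatInspector algorithm, which scores sites against weighted matrices and therefore also reports sites that differ from the consensus; the function names and the two test sequences are hypothetical.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(consensus: str) -> str:
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[code] for code in consensus.upper())

def find_sites(consensus: str, sequence: str) -> list:
    """Return 0-based start positions of all (overlapping) exact matches."""
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile("(?=(" + iupac_to_regex(consensus) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]

# Hypothetical test sequences, each containing one exact consensus match.
p_sites = find_sites("ACCWACCNN", "TTACCTACCGATT")
gamyb_sites = find_sites("YAACSGMC", "GGTAACGGACTT")
print(p_sites, gamyb_sites)
```

Because only exact consensus matches are returned, a scan of this kind would be expected to find fewer sites than the matrix-based search whose statistics are listed above.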

REFERENCES

Brown A J H, Perry S J, Saunders S E and Burke J F (1999) Extender PCR: A method for the isolation of sequences regulating gene expression from genomic DNA. BioTechniques 26 (5), 804–806.

Quandt K, Frech K, Karas H, Wingender E and Werner T (1995) MatInd and MatInspector—New fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Research 23, 4878–4884.

Annexes—Sequences & BLAST results (I) OAT β-amyrin synthase cDNA (nucleotide sequence) SEQ ID NO: 14 1        10        20        30        40        50        60        70 ATTGCTTGTTTTCCTCGCATACACTGCCCGTTGTTGGTGCGCACCATGTGGAGGCTAACAATAGGTGAGGG CGGCGGTCCGTGGCTGAAGTCGAACAATGGCTTCCTTGGCCGCCAAGTGTGGGAGTACGACGCCGATGCCG GCACGCCGGAAGAGCGTGCCGAGGTTGAGAGGGTGCGTGCGGAATTCACAAAGAACAGGTTCCAGAGGAAG GAGTCACAGGACCTTCTTCTACGCTTGCAGTACGCAAAAGACAACCCTCTTCCGGCGAATATTCCGACAGA AGCCAAGCTTGAAAAGAGTACAGAGGTCACTCACGAGACTATCTACGAATCATTGATGCGAGCTTTACATC AATATTCCTCTCTACAAGCAGACGATGGGCATTGGCCTGGTGATTACAGTGGGATTCTCTTCATTATGCCT ATCATTATATTCTCTTTATATGTTACTAGATCACTTGACACCTTTTTATCTCCGGAACATCGTCATGAGAT ATGTCGCTACATTTACAATCAACAGAATGAAGATGGTGGTTGGGGAAAAATGGTTCTTGGCCCAAGTACCA TGTTTGGATCGTGTATGAATTATGCAACCTTAATGATTCTTGGCGAGAAGCGAAATGGTGATCATAAGGAT GCATTGGAAAAAGGGCGTTCTTGGATTTTATCTCATGGAACTGCAACTGCAATACCACAGTGGGGAAAAAT ATGGTTGTCGATAATTGGCGTTTACGAATGGTCAGGAAACAATCCTATTATACCTGAATTGTGGTTGGTTC CACATTTTCTTCCGATTCACCCAGGTCGTTTTTGGTGTTTTACCCGGTTGATATACATGTCAATGGCATAT CTCTATGGTAAGAAATTTGTTGGGCCTATTAGTCCTACAATATTAGCTCTGCGACAAGACCTCTATAGTAT ACCTTACTGCAACATTAATTGGGACAAGGCGCGTGATTATTGTGCAAAGGAGGACCTTCATTACCCACGCT CACGGGCACAAGATCTTATATCTGGTTGCCTAACGAAAATTGTGGAGCCAATTTTGAATTGGTGGCCAGCA AACAAGCTAAGAGATAGAGCTTTAACTAACCTCATGGAGCATATCCATTATGACGACGAATCAACCAAATA TGTGGGCATTTGCCCTATTAACAAGGCATTGAACATGATTTGTTGTTGGGTAGAAAACCCAAATTCGCCTG AATTCCAACAACATCTTCCACGATTCCATGACTATTTGTGGATGGCGGAGGATGGAATGAAGGCACAGGTA TATGATGGATGTCATAGCTGGGAACTAGCGTTCATAATTCATGCCTATTGTTCCACGGATCTTAGTAGCGA GTTTATCCCGACTCTAAAAAAGGCGCACGAGTTCATGAAGAACTCACAGGTTCTTTTCAACAACCCAAATC ATGAAAGCTATTATCGCCACAGATCAAAAGGCTCATGGACCCTTTCAAGTGTAGATAATGGTTGGTCTGTA TCTGATTGTACTGCGGAAGCTGTTAAGGCATTGCTACTATTATCAAAGATATCCGCTGACCTTGTTGGCGA TCCAATAAAACAAGACAGGTTGTATGATGCCATTGATTGCATCCTATCTTTCATGAATACAGATGGAACAT TTTCTACCTACGAATGCAAACGGACATTCGCTTGGTTAGAGGTTCTCAACCCTTCTGAGAGTTTTCGGAAC ATTGTCGTGGACTATCCATCTGTTGAATGCACATCATCTGTGGTTGATGCTCTCATATTATTTAAAGAGAC 
GAATCCACGATATCGAAGAGCAGAGATAGATAAATGCATTGAAGAAGCTGTTGTATTTATTGAGAACAGTC AAAATAAGGATGGTTCATGGTATGGCTCATGGGGTATATGTTTCGCATATGGATGCATGTTTGCAGTAAGG GCGTTGGTTGCTACAGGAAAAACCTACGACAATTGTGCTTCTATCAGGAAATCATGCAAATTTGTCTTATC AAAGCAACAAACAACAGGTGGATGGGGTGAAGACTATCTTTCTAGTGACAATGGGGAATATATTGATAGCG GTAGGCCTAATGCTGTGACCACCTCATGGGCAATGTTGGCTTTAATTTATGCTGGACAGGTTGAACGTGAC CCAGTACCACTGTATAATGCTGCAAGACAGCTAATGAATATGCAGCTAGAAACAGGTGACTTCCCCCAACA GGAACACATGGGTTGCTTCAACTCCTCCTTGAACTTCAACTACGCCAACTACCGCAATCTATCCCGATTAA TGGCTCTTGGGGAACTTCGCCGTCGACTTCTTGCGATTAAGAGCTGATATGGAAACAAACATGGATGTCTA GGCTGCGAGGAATAAGAACATTGCTCCCACGAGCATTCATGCGTTATTTTCTTTGAGAAATAAGTTCTCTT CCTACCGATGTCATCATGTAACTTTTCGGAATATTTTATGTGT

(II)—OAT β-amyrin synthase cDNA (amino acid sequence) SEQ ID NO: 15 MWRLTIGEGGGPWLKSNNGFLGRQVWEYDADAGTPEERAEVERVRAEFTKNRFQRKESQDLLLRLQYAKDNPLPA NIPTEAKLEKSTEVTHETIYESLMRALHQYSSLQADDGHWPGDYSGILFIMPINIFSLYVTRSLDTFLSPEHRHE ICRYIYNQQNEDGGWGKMVLGPSTMFGSCMNYATLMILGEKRNGDHKDALEKGRSWILSHGTATAIPQWGKIWLS IIGVYEWSGNNPIIPELWLVPHFLPIHPGRFWCFTRLIYMSMAYLYGKKFVGPISPTILALRQDLYSIPYCNINW DKARDYCAKEDLHYPRSRAQSDLISGCLTKIVEPILNWWPANKLRDRALTNLMEHIHYDESTKVVGICPINKALN MICCWVENPNSPEFQQHLPRFHDYLWMAEDGMKAQVYDGCHSWELAFIIHAYCSTDLTSEFIPTLKKAHEFMKNS QVLFNHPNHESYYRHRSKGSWTLSSVDNGWSVSDCTAAEAVKALLLLSKISADLVGDPIKQDRLYDAIDCILSFMN TDGTFSTYECKRTFAWLEVLNPSESFRNIVVDYPSVECTSSVVDALILFKETNPRYRRAEIDKCIEEAVVFIENS QNKDGSWYGSWGICFAYGCMFAVRALVATGKTYDNCASIRKSCKFVLSKQQTTGGWGEDYLSSDNGEYIDSGRPN AVTTSWAMLALIYAGQVERDPVPLYNAARQLMNMQLETGDFPQQEHMGCFNSSLNFNYANYRNLYPIMALGELRR RLLAIKS
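The relationship between the nucleotide sequence of SEQ ID NO: 14 and the amino acid sequence of SEQ ID NO: 15 can be illustrated by translating the first codons of the open reading frame with the standard genetic code. The sketch below is a minimal illustration; the 45-nt excerpt is copied from SEQ ID NO: 14 starting at the ATG, and the function name is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; stop at a stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# First 45 nt of the open reading frame from SEQ ID NO: 14.
ORF_START = "ATGTGGAGGCTAACAATAGGTGAGGGCGGCGGTCCGTGGCTGAAG"

peptide = translate(ORF_START)
print(peptide)
```

The translated peptide agrees with the first 15 residues of SEQ ID NO: 15.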

(III)—ortls.pk001.c14 (putative oxidosqualene cyclase EST clone from Avena strigosa) (SEQ ID NO: 16) ACAGAGGTCACTCACGAGACTATCTACGAATCATTGATGCGAGCTTTACA TCAATATTCCTCTCTACAAGCAGACGATGGGCATTGGCCTGGTGATTACA GTGGGATTCTCTTCATTATGCCTATCATTATATTCTCTTTATATGTTACT AGATCACTTGACACCTTTTTATCTCCGGAACATCGTCATGAGATATGTCG CTACATTTACAACCAACAGAATGAAGATGGTGGTTGGGGAAAAATGGTTC TTGGCCCAACGT

(IV)—CLUSTAL W (1.8) multiple sequence alignment of 6 clones encoding the 5′ end of the putative oxidosgualene cyclase from A. strigosa (amy16as: SEQ ID NO: 17; amy23as: SEQ ID NO: 18; amy7As: SEQ ID NO: 19; amy11As: SEQ ID NO: 20; amy10As: SEQ ID NO: 21; amy15As: SEQ ID NO: 22) amy16As ACCATGTGGAGGCTAACAATAGGTGAGGGCGGCGGTCCGTGGCTGAAGTCGAACAATGGC 60 amy23As ACCATGTGGAGGCTAACAATAGGTGAGGGCGGCGGTCCGTGGCTGAAGTCGAACAATGGC 60 amy7As ACCATGTGGAGGCTAACAATAGGTCAGGGCGGCGGTCCGTGGCTGAAGTCGAACAATGGC 60 amy11As ACCATGTGGAGGCTAACAATAGGTGAGGGCGGCGGTCCGTGGCTGAAGTCGAACAATGGC 60 amy10As ACCATGTGGAGGCTAACAATAGGTGAGGNCGGCGGTCCGTGGCTGAAGTCGAACAATGGC 60 amy15As ACCATGTGGAGGCTAACAATAGGTGAGGGCGGCGGTCCGTGGCTGAAGTCGAACAATGGC 60 *************************** ******************************** amy16As TTCCTTGGCCGCCAAGTGTGGGAGTACGACGCCGATGCCGGCACGCCGGAAGAGCGTGCC 120 amy23As TTCCTTGGCCGCCAAGTGTGGGAGTACGACGCCGATGCCGGCACGCCGGAAGAGCGTGCC 120 amy7As TTCCTTGGCCGCCAAGTGTGGGAGTACGACGCCGATGCCGGCACGCCGGAAGAGCGTGCC 120 amy11As TTCCTTGGCCGCCAAGTGTGGGAGTACGACGCCGATGCCGGCACGCCGGAAGAGCGTGCC 120 amy10As TTCCTTGGCCGCCAAGTGTGGGAGTACGACGCCGATGCCGGCACGCCGGAAGAGCGTGCC 120 amy15As TTCCTTGGCCGCCAAGTGTGGGAGTACGACGCCGATGCCGGCACGCCGGAAGAGCGTGCC 120 ************************************ *********************** amy16As GAGGTTGAGAGGGTGCGTGCGGAATTCACAAAGAACAGGTTCCA-GAGGAAGGAGTCACA 179 axny23As GAGGTTGAGAGGGTGCGTGCGGAATTCACAAAGAACAGGTTCCA-GAGGAAGGAGTCACA 179 amy7As GAGGTTGAGAGGGTGCGTGCGGAATTCACAAAGAACAGGTTCCA-GAGGNAGGAGTCACA 179 amy11As GAGGTTGAGAGGGTGCGTGCGGAATTCACAAAGAAcAGCTTCCA-GAGGAAGGAGTCACA 179 amy10As GAGGTTGAGAGGGTGCGTGCGGAATTCACAAAGAACAGGTTCCA-GAGGAAGGAGTCACA 179 amy15As GAGGTTGAGAGGGTGCGTGCGGAATTCACAAAGAACAGGTTCCNAGAGGAAGGAGTCACA 180 *******************************************  **** ********** amy16As GGACCTTCTTCTACGCTTGCAGTACGCAAAAGACAACCCTCTTCCGGCGAATATTCCGAC 239 axny23As GGACCTTCTTCTACGCTTGCAGTACGCAAAAGACAACCCTCTTCCGGCGAATATTCCGAC 239 amy7As GGACCTTCTTCTACGCTTGCAGTACGCAAAAGACAACCCTCTTCCGGCGAATATTCCGAC 239 amy11As 
GGACCTTCTTCTACGCTTGCAGTACGCAAAAGACAACCCTCTTCCGGCGAATATTCCGAC 239 amy10As GGACCTTCTTCTACGCTTGCAGTACGCAAAAGACAACCCTCTTCCGGCGAATATTCCGAC 239 amy15As GGACCTTCTTCTACGCTTGCAGTACGCAAAAGACAACCCTCTTCCGGCGAATATTCCGAC 240 ********** ************************************************* amy16As NGAAGCCAAGCTTGAAAAGAGTACAGAGGTCACTCACGAGACTATCTACGAATCATTGAT 299 amy23As AGAAGCCAAGCTTGAAAAGAGTACAGAGGTCACTCACGAGACTATCTACGAATCATTGAT 299 amy7As AGAAGCCAAGCTTGAAAAGAGTACAGAGGTCACTCACGAGACTATCTACGAATCATTGAT 299 amy11As AGAAGCCAAGCTTGAAAAGAGTACAGAGGTCACTCACGAGACTATCTACGAATCATTGAT 299 amy10As AGAAGCCAAGCTTGAAAAGAGTACAGAGGTCACTCACGAGACTATCTACGAATCATTGAT 299 amy15As AGAAGCCAAGCTTGAAAAGAGTACAGAGGTCACTCACGAGACTATCTACGAATCATTGAT 300 *********************************************************** amy16As GCGAGCTCTACATCAATATTCCTCTTTACAAGCAGACGATGGGCATTGGCCTGGTGATTA 359 amy23As GCGAGCTCTACATCAATATTCCTCTTTACAAGCAGACGATNGGCATTGGCCTGGTGATTA 359 amy7As GCGAGCTCTACATNAATATTCCTCTTTACAAGCAGACGATGGGCATTGGCCTGGTGATTA 359 amy11As GCGAGCTCTACATCAATATTCCTCTTTACAAGCAGACGATGGGCATTGGCCTGGTGATTA 359 amy10As GCGAGCTCTACANCAATATTCCTCTTTACAAGCAGACGATGGGCATTGGCCTGGTGATTA 359 amy15As GCGAGCTCTACATCAATATTCCTCTTTACAAGCAGACGATGGGCATTGGCCTGGTGATTA 360 ************  ************************** ******************* amy16As CAGTGGGATTCTCTTCATTATGCCTATCA 388 amy23As CAGTGGGATTCTCTTCATTATGCCTATCA 388 amy7As CAGTGGGATTCTCTTCATTATGCCTATCA 388 amy11As CAGTGGGATTCTCTTCATTATGCCTATCA 388 amy10As CAGTGGGATTCTCTTCATTATGCCTATCA 388 amy15As CAGTGGGATTCTCTTCATTATGCCTATCA 389 *****************************

(V)—CLUSTAL W (1.8) multiple sequence alignment of 6 clones encoding the 3′ end of the putative oxidosqualene cyclase from A. strigosa (amy16as: SEQ ID NO: 23; amy23as: SEQ ID NO: 24; amy7As: SEQ ID NO: 25; amy11As: SEQ ID NO: 26; amy10As: SEQ ID NO: 27; amy15As: SEQ ID NO: 28) amy11As CATAAAATATTCCGAAAAGTTACATGATGACATCGGTAGGAAGAGAACTTATTTCTCAAA 60 amy15As CATAAAATATTCCGAAAAGTTACATGATGACATCGGTAGGAAGAGAACTTATTTCTCAAA 60 amy23As CATAAAATATTCCGAAAAGTTACATGATGACATCGGTAGGAAGAGAACTTATTTCTCAAA 60 amy10As CATAAAATATTCCGAAAAGTTACATGATGACATCGNTAGGAAGAGAACTTATTTCTCAAA 60 amy7As CATAAAATATTCCGAAAAGTTACATGATGACATCGGTAGGAAGAGAACTTATTTCTCAAA 60 amy16As CATAAAATATTCCGAAAAGTTACATGATGACATCGGTAGGAAGAGAACTTATTTCTCAAA 60 *********************************** ************************ amy11As GAAAATAACGCATGAATGCTCGTGGGAGCAATGTTCTTATTCCTCGCAGCCTAGACATCC 120 amy15As GAAAATAACGCATGAATGCTCGTGGGAGCAATGTTCTTATTCCTCGCAGCCTAGACATCC 120 amy23As GAAAATAACGCATGAATGCTCGTGGGAGCAATGTTCTTATTCCTCGCAGCCTAGACATCC 120 amy10As GAAAATAACGCATGAATGCTCGTGGGAGCAATGTTCTTATTCCTCGCAGCCTAGACATCC 120 amy7As GAAAATAACGCATGAATGCTCGTGGGAGCAATGTTCTTATTCCTCGCAGCCTAGACATCC 120 amy16As GAAAATAACGCATGAATGCTCGTGGGAGCAATGTTCTTATTCCTCGCAGCCTAGACATCC 120 ************************************ *************** **** ** amy11As ATGTTTGTTTCCATATCAGCTCTTAATCGCAAGAAGNCGACGGCGAAGTTCCCCAAGNGC 180 amy15As ATGTTTGTTTCCATATCAGCTCTTAATCGCAAGAAGNCGACGGCGAAGTTCCCCAAGAGC 180 amy23As ATGTTTGTTTCCATATCAGCTCTTAATCGCAAGAAGTCGACGGCGAAGTTCCCCAAGAGC 180 amy10As ATGTTTGTTTCCATATCAGCTCTTAATCGCAAGAAGTCGACGGCGAAGTTCCCCAAGAGC 180 amy7As ATGTTTGTTTCCATATCAGCTCTTAATCGCAAGAAGTCGACGGCGAAGTTCCNCAAGAGC 180 amy16As ATGTTTGTTTCCATATCAGCTCTTAATCGCAAGAAGTCGACGGCGAAGTTCCCCAAGAGC 180 ************************************ *************** **** ** amy11As CATAA-TCGGTATAGATTGCGGTAGTTGGCGTAGTTGAAGTTCAAGGAGG-AGTTGAAG 238 amy15As CATAA-TCGGTATAGATTGCGGTAGTTGGCGTAGTTGAAGTTCAAGGAGG-AGTTGAAG 238 amy23As CATANATCGGTATAGATTGCGGTAGTTGGCGTAGTTGAAGTTCAAGGAGG-AGTTGAAG 239 amy10As 
CATAA-TCGGTATAGATTGCGGTAGTTGGCGTAGTTGAAGTTCAAGGAGG-AGTTGAAG 238 amy7As CATAA-TCGGTATAGATTGCGGTAGTTGGCGTAGTTGAAGTTCAAGGAGG-AGTTGAAG 238 amy16As CATNA-TCGGTATAGATTGCGGTAGTTGGCGTAGTTGAAGTTCAAGGANGGAGTTGAAG 239 amy11As CAACCCATGTGTTCC-TGTTGGGGGAAGTCACCTGTTTCTAGCTGCATATTCATTAGCTG 297 amy15As CAACCCATGTGTTCC-TGTTGGGGGAAGTCACCTGTTTCTAGCTGCATATTCATTAGCTG 297 amy23As CAACCCATGTGTTNCCTGTTGGGGGAAGTCACCTGTTTCTAGCTGCATATTCATTAGCTG 299 amy10As CAACCCATGTNTTCC-TGTTGGGGGAAGTCACCTGTTTCTAGCTGCATATTCATTAGCTG 297 amy7As CAACCCATGTGTTCC-TGTTGGGGGAAGTCACCTGTTTCTAGCTGCATATTCATTAGCTG 297 amy16As CAACCCATGTGTTCC-TGTTGGGGGAAGTCACCTGTTTCTAGCTCCATATTCATTAGCTG 298 amy11As TCTTGCAGCATTATACAGGGGTACTGGGTCA 328 amy15As TCTTGCAGCATTATACAGGGGTACTGGGTCA 328 amy23As TCTTGCAGCATTATACAGGGGTACTGGGTCA 330 amy10As TCTTGCAGCATTATACAGGGGTACTGGGTCA 328 amy7As TCTTGCAGCATTATACAGGGGTACTGGGTCA 328 amy16As TCTTGCAGCATTATACAGGGGTACTGGGTCA 329

(VI)—Alignment of the DCTAE motif in the predicted amino acid sequences of triterpene biosynthetic enzymes of different species (RAT OSC: SEQ ID NO: 29; YEAST OSC: SEQ ID NO: 30; ARAB OSC: SEQ ID NO: 31; OAT CYC: SEQ ID NO: 31; PANAX CYC: SEQ ID NO: 32; OAT AMY: SEQ ID NO: 33; PANAX AMY: SEQ ID NO: 34)
RAT OSC       HKGGFPFSTLDGGWIVADDTAEALKAVLLL
YEAST OSC 439 RKGAWGFSTKTQGYTVADCTAEAIKAIIMV
ARAB OSC  466 SKGAWPFSTADHGWPISDCTAEGLKAALLL
OAT CYC   467 SKGAWPFSTADHGWPISDCTAEGLKAALLL
PANAX CYC 466 SKGAWPFSTADHGWPISDCTAEGFKAVLQL
OAT AMY   467 SKGSWTLSSVDNGWSVSDCTAEAVKALLLL
PANAX AMY 469 SKGSWTFSDQDHGWQVSDCTAEGLKCGLIF

(VII)—Alignment of the predicted oxidosgualene cyclase with cycloartenol synthase of A. strigosa SEQ ID NOS: 15 and 35. b-amyrin MWRLTIGEGGG-PWLKSNNGFLGRQVWEYDADAGTPEERAEVERVRAEFTKNRFQRKESQ 59 cycloart MWRLKIAEGGGDPWLRTKNAHVGRQVWEFDPEAGDPEALAAVEAARRDFAAGRHRLKHSS 60 b-amyrin DLLLRLQYAKDNPLPANIPTEAKLEKSTEVTHETIYESLMRALHQYSSLQADDGHWPGDY 119 cycloart DRLMRIQFEKENPLKLDLP-AIKLEENEDVTEEAVSTSLKRAISRFSTLQAHDGHWPGDY 119 b-amyrin SGILFIMPINIFSLYVTRSLDTFLSPEHRHEICRYIYNQQNEDGGWGKMVLGPSTMFGSC 179 cycloart GGPMFLMPGLLITLYVTGSLNTVLSPEHQKEIRRYLYNHQNEDGGWGLHIEGPSTMFGSA 179 b-amyrin MNYATLMILGEKRNGDHKDALEKGRSWILSHGTATAIPQWGKIWLSIIGVYEWSGNNPII 239 cycloart LTYVSLRLLGEGPES-GDGAMEKGRNWILDHGGATYITSWGKFWLAVLGVFDWSGNNPLP 238 b-amyrin PELWLVPHFLPIHPGRFWCFTRLIYMSMAYLYGKKFVGPISPTILALRQDLYSIPYCNIN 299 cycloart PEIWMLPYRLPIHPGRMWCHCRMVYLPMCYVYGKRFVGKITPLILELRNELYKTPYSKID 298 b-amyrin WDKARDYCAKEDLHYPRSRAQDLISGCLTKIVEPILNWWPANKLRDRALTNLMEHIHYDD 359 cycloart WDSARNLCAKEDLYYPHPLIQDILWATLHKFVEPVMMHWPGNKLREKALNHVMQHVHYED 358 b-amyrin ESTKYVGICPINKALNMICCWVENPNSPEFQQHLPRFHDYLWMAEDGMKAQVYDGCHSWE 419 cycloart ENTRYICIGPVNKVLNMLTCWIEDPNSEAFKLHIPRVHDYLWVAEDGMKMQGYNGSQLWD 418 b-amyrin LAFIIHAYCSTDLTSEFIPTLKKAHEFMKNSQVLFNHP-NHESYYRHRSKGSWTLSSVDN 478 cycloart TAFAVQAITATGLIDEFAPTLKLAHNFIKNSQVLDDCPGDLSYWYRHISKGAWPFSTADH 478 b-amyrin GWSVSDCTAEAVKALLLLSKISADLVGDPIKQDRLYDAIDCILSFMNTDGTFSTYECKRT 538 cycloart GWPISDCTAEGLKAALLLSKISPEIVGEPVEVNRLYDAVNCLMSWMNNNGGFATYELTRS 538 b-amyrin FAWLEVLNPSESFRNIVVDYPSVECTSSVVDALILFKETNPRYRRAEIDKCIEEAVVFIE 598 cycloart YAWLELINPAETFGDIVIDYPYVECTSAAIQALTSFKKLYPGHRRKDVDNCINKAANFIE 598 b-amyrin NSQNKDGSWYGSWGICFAYGCMFAVRALVATGKTYDNCASIRKSCKFVLSKQQTTGGWGE 658 cycloart SIQRSDGSWYGSWAVCFTYGTWEGVKALVAAGRTFKSSPAIRKACEFLMSKELPFGGWCK 658 b-amyrin DYLSSDNGEYIDSG--RPNAVTTSWAMLALIYAGQVERDPVPLYNAARQLMNMQLETGDF 716 cycloart SYLSCQDQVYTNLEGKHAHAVNTGWAMLTLIDAGQAERDPTPLHRAAKVLINLQSEDGEF 718 b-amyrin PQQEHMGCFNSSLNFNYANYRNLYPIMALGELRRRLLAIKS 757 cycloart PQQEIMGVFNKNCMISYSQYRDIFPVWALGEYRCRVLAAGK 
759

(VIII)—Genomic sequence of β-amyrin synthase from Avena strigosa (SEQ ID NO: 36) ATGTGGAGGCTAACAATAGGTGAGGGCGGCGGTCCGTGGCTGAAGTCGAACAATGGCTTCCTTGGCCGCCAA GTGTGGGAGTACGACGCCGATGCCGGCACGCCGGAAGAGCGTGCCGAGGTTGAGAGGGTGCGTGCGGAATTC ACAAAGAACAGGTTCCAGAGGAAGGAGTCACAGGACCTTCTTCTACGCTTGCAGgtacatgcgtcttctttc ccctacttccatatacacccagtaqtatatgttgccactgccgttagctctagctttaggactgagaaaagg gctctcagataatccatatctctctttagatggagggtttgcttttatttattattacatattgttcaatcc ttgctgtgtatatcatcaactgcagTACGCAAAAGACAACCCTCTTCCGGCGAATATTCCGACAGAAGCCAA GCTTGAAAAGAGTACAGAGGTCACTCACGAGACTATCTACGAATCATTGATGCGAGCTTTACATCAATATTC CTCTCTACAAGCAGACGATGGGCATTGGCCTGGTGATTACAGTGGGATTCTCTTCATTATGCcTATcATTgt aagtattttactattattttatgatacagcaatttggcaattaatatatgcatacgaggtttcttatttcgt aaatactcaagacaatatagcatgtggaatcttataatttctataatgaatatgtaccgtcttgtgtgcgca atacgtatactatattattccgctatgcatatagtattacataccaatattgatagatgttcaaaccaatta tgaagagtttaactacaaqatttaatatagtagtttctgttattctagcagcaagttacctccattaggttc cqgaagttctactcttaccacctatatatatgtattattgcttatactaccttcgtctcaaagtttaagact tttttttaaagtcaatttatggaaagtttgaactaacttttataaaactatcaagaactatgatattatatt tgccatgtgaaaatatgttttattatgtatcaaagggtatggatttcgtaccgtaaatataatattgttgtc taaaatcttggttgaactttacttagtttgacttttggaaagtatataaqccttaaactttaaaatagacgt agtaattcagatgcacttcactgatatcccgacaaaagtacaaaatacatttatggaatgtcaaatttattt gaaaacaacacatttggtttagcttcaatatttcggaaaagaaaatatgaggagtgatttaaataagttctt aaggttttcatqaaaaacaaatctgttatggggactttatqcaaaqagaacaagattggctcttagaaattt ctttagatatgattaaattaaaatacagtgtttgcactaaaaccacatttggtttgatttgaatatttgaaa gagatagaaaatcttgaacatttatttttagggaatataggctttattactaccatcctatgtatcatcgat ggtggctcatcacattgatcacaactctgaaaactaagaagtctccaacatttagacaatgatattggtttt tcaaatttcagtaacacttacaagaattccgttgattttattctccatccgagaactcatttctcctctcct aataatgatgcacatatatgatgggatcttttctttatgttgcagATATTCTCTTTATATGTTACTAGATCA CTTGACACCTTTTTATCTCCGGAACATCGTCATGAGATATGTCGCTACATTTACAATCAACAGgcatgggat taaacctaacacatatttccatatttgttttctatatgtttgtgattttgtgaccaaaataaaaacagtact 
taatgcaacatatattgagcaqAATGAAGATGGTGGTTGGGGAAAAATGGTTCTTGGCCCAAGTACCATGTT TGGATCGTGTATGAATTATGCAACCTTAATGATTCTTGGCGAGAAGCGAAATGGTGATCATAAGGATGcATT GGAAAAAGGGCGTTCTTGGATTTTATCTCATGGAACTGCAACTGCAATACCACAGTGGGGAAAAATATGGTT GTCGgtatgttaaataacacaagatatcaatgctcatatatgttctcttctgaactaacgttaaatcaacct actatttgataacatcatagATAATTGGCGTTTACGAATGGTCAGGAAACAATCCTATTATACCTGAATTGT GGTTGGTTCCACATTTTCTTCCGATTGACCCAGgtatttctatctagcttgcatatataacaaaattgttgt agaacgcatgcttagaccatcattctgtggaattattctgtgcaatttgttgcttgtggaagcaatttaacc atatatcaaacaaggaatattgaggcatggtacctgaaatagttttttgaaaaatacatgccgaaaaggaaa tcaatgtttcaattaggcatgtttgcacgtagattccacaaqattctcttgtatatgttttgatcttggaga tacatgtatatatttatgtatctttcatattatctcaaaaaaataacatgttactaccccctctatccataa taagtgtcggtcacttagtacaaactttatactagcttagtacaaaatggacgactcttattatggattgca gggagtactaaatattatgaagttgaaccttatcattcacaagtaatttattggaaaataatccttcatatg tagGTCGTTTTTGGTGTTTTACCCGGTTGATATACATGTCAATGGCATATCTCTATGGTAAGAAATTTGTTG GGCCTATTAGTGCTACAATATTAGCTCTGCGACAAGACCTCTATAGTATACCTTACTGCAACATTAATTGGG ACAAGGCGCGTGATTATTGTGCAAAGgttagttagttaatcaatcactatatatatgtattcagtttgttag aatatattaatttagcccatgtcactacataatattttcatggattcaagattaagaacatcacgtagaata atgaagtacatcatttcaqtacttggtatctcagaaaaaatataqactaagaaagctagtgttcttcaaaaa ttttatgttgtttcagGAGGACCTTCATTACCCACGCTCACGGGCACAAGATCTTATATCTGGTTGCCTAAC GAAAATTGTGGAGCCAATTTTGAATTGGTGGCCAGCAAACAAGCTAAGAGATAGAGCTTTAACTAACCTCAT GGAGCATATCCATTATGACGACGAATCAACCAAATATGTGGGCATTTGCCCTATTAACAAGgtgaaattatt ttcaaattgatttgcaccttttactttaataatgacggatgttattccattctaatgttttaacatgtttat tgtaattagGCATTGAACATGATTTGTTGTTGGGTAGAAAACCCAAATTCGCCTGAATTCCAACAACATCTT CCACGATTCCATGACTATTTGTGGATGGCGGAGGATGGAATGAAGGCACAGgttggtatagagctcttgtca gatattttgccaatttaactacgtgccaattcttcacaaccattaacctttttcatgaatatatatttcctc aaacaaaatgtgagaatcttttgggttacaaggattttttattttcatctatatctaggttgcattcaataa gcatgtttgtgcatgtccgagttctcctgaaccaaactaaaatgcatattctctttagctgcacatagtgta tatgaaataaaattatggtaataatatttttactttagttaattctaatgacgaaatagttgatatgcctat 
atcgtttcgaatatataaatcagaggtagttagaaaaattattggacttacatcaaatgcaaactgtgaatg tataagtaatatgtatacaatcgcagGTATATGATGCATGTCATAGCTGGGAACTAGCGTTCATAATTCATG CCTATTGTTCCACGGATCTTAGTAGCGAGTTTATCCCGACTCTAAAAAAGGCGCACGAGTTCATGAAGAACT CACAGgtttgttgttctccatattatattattgctcaaattctgaaaagatctaacattaattgtctaccct tgaagGTTCTTTTCAACCACCCAAATCATGAAAGCTATTATCGCCACAGATCAAAAGGCTCATGGACCCTTT CAAGTGTAGATAATGGTTGGTCTGTATCTGATTGTACTGCGGAAGCTGTTAAGgttaacataagaaccatgt cttccaattgtacatatataagtacatatgtgaatacatgacgggttaccctgtataagttgaaatgaacta ttcatgaatatattgaatotacattaatattcattattttttcagGCATTGCTACTATTATCAAAGATATCc GCTGACCTTGTTGGCGATCCAATAAAACAAGACAGGTTGTATGATGCCATTGATTGCATCCTATCTTTCATG gtatgagaatctaaattggatcaattaacaaacgtacattactaaacaagggaaactatgcagacccattac taaagaaatgtgagcccacctagctagataattttatctaaaagtattaaattatattttgcacaacataca aaagttaaatttgttgtacaatgcatattattttctaaaaaaaatgcaaaaataatttggagaaattttata aggtagtccacggtaatttaatccatttctataatgcaaatagagtctcactaatagcaggtctccttttct cgttttgaagAATACAGATGGAACATTTTCTACCTACGAATGCAAACGGACATTCGCTTGGTTAGAGgttag tgatattcctttaaagttttataacatggtacaattaagatgaaatatcatttttgtattgtatgacttgtc catgagaacaaggtattgggattgaataagaagtcaaaagaaaaccaaatacaacaatgatatattaattgt aattcttatggtcattttgcatttctctttcatacccaagaaattttttctcctgaacaataagtttggata acoctatcccctttaacaaaatatctcttctacgagctagGTTCTCAAcCCTTCTGAGAGTTTTCGGAACAT TGTCGTGGACTATcCgtaagacaaaaaacacctacttcataaattatctttacttctatattcaaatattca ttttcgcgaactgacttgatatacataataatggtcagATCTGTTGAATGCACATCATCTGTGGTTGATGCT CTCATATTATTTAAAGAGACGAATCCACGATATCGAAGAGCAGAGATAGATAAATGCATTGAAGAAGCTGTT GTATTTATTGAGAACAGTCAAAATAAGGATGGTTCATGgtaagtgacatgatataaattatgcgttacaata acttttacttttgattaaatttgaaaatttattacttcttgtatctcatagGTATGGCTCATGGGGTATATG TTTCGCATATGGATGCATGTTTGCAGTAAGGGCGTTGGTTGCTACAGGAAAAACCTACGACAATTGTGCTTC TATCAGGAAATCATGCAAATTTGTCTTATCAAAGCAACAAACAACAGGTGGATGGGGTGAAGACTATCTTTC TAGTGACAATGGGgtaatataacaaactactttacccctataacattttactaatggtaaatcaaatccatc atgattattcatagatttaagtcatacatgatatagtaaacatagaaattgatactagttgagagttttgtt 
gttctataaatatactagttgagagttcagtagttctataagccatgcatggaattacgaaacattataaac tcaacgcaagatgatatagttgacaaattttaaaaatatcatatgtcttgttaaaaaaaagatcctagttat attttgagcatgaatatcttcaaatattgtgtatatgtgagaaaggtgatgtttaattgcacaaggtacact aaataaaatggttaatttgtgtatgcccaaaaaagagagatatagaaagagctaaagtaaacttttaattga cacatcttgttcttaacatttattttttatgaaagtctcagtacattcacgatacctagcaatatgaaaatt cattgattagacagaagataaaattgccattccacaattattaaatcatatatgttaatttttgcctttttg cttatttttgtcatgataatggatgcataacaccatgttttccaatggtctcatctactaatctcagatatc ttaatacagcttggtctacaactgttacaccccagttttgatgaccgtgtccacaaatacattataaactac tactaatgaoctaacacaaaaaaatgacgacaaagacaaatatccaatagaaggatcttttgtactgaacaa atgaagaaaattgtacatatatattgtgtaatatttaatttgttttcctttagtgtactccatccatgaaaa tgatattaaatcaatatttgcaatcatgcggtcaaatctgttcttctagatgcgtaagctaaagtcattata tgtatatatatattttcaagaacacataggcatgttgtgttttctaatcacgttttgtacagGAATATATTG ATAGCGGTAGGCCTAATGCTGTGACCACCTCATGGGCAATGTTGGCTTTAATTTATGCTGGACAGgtttgtc aaatatttttccttgtttgtctagaatatgaattttttattaaaaaggaaaagttctcactattcttgaata gtcgagttatctaacagaataatttatattttgtttttttaataagGTTGAACGTGACCCAGTACCACTGTA TAATGCTGCAAGACAGCTAATGAATATGCAGCTAGAAACAGGTGACTTCCCCCAACAGgtaatatgtttccg tcctacatgttttcaaacaaaaatgcaaagtaatcttaagtttaattgaaactgatcttttgttaatgaaac tcaatgtagaccttaagggaacaaccagtagaaataaaatacttgtgaattgataactctggaaagtgtatg cattaatgttggtgtgaatgtggtaaatgttggcattgcgtcataatttttgcatcggtacttacaaagttt aattaacactaatctcttgtcagattcaatgatcattaaaaattaagatataacctccatctagcttcttac ttacagtttcaccttggtaatagGACACATGGGTTGCTTCAACTCCTCCTTGAACTTCAACTACGCCAACT ACCGCAATCTATACCCGATTATGGCTCTTGGGGAACTTCGCCGTCGACTTCTTGCGATTAAGAGCTGA
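In the genomic sequence above, the segments shared with the cDNA appear in upper case and the intervening segments in lower case. Assuming that this case convention marks exons and introns respectively, the following sketch splits such a sequence, rebuilds the spliced sequence, and flags introns that do not begin with gt and end with ag. The toy sequence is hypothetical and is not taken from SEQ ID NO: 36.

```python
import re

def split_exons_introns(genomic: str):
    """Split a case-annotated sequence into exon (upper) and intron (lower) runs."""
    cleaned = re.sub(r"\s+", "", genomic)      # drop line breaks and spaces
    exons = re.findall(r"[A-Z]+", cleaned)
    introns = re.findall(r"[a-z]+", cleaned)
    return exons, introns

def spliced_sequence(genomic: str) -> str:
    """Concatenate the exons, i.e. remove every lower-case intron."""
    exons, _ = split_exons_introns(genomic)
    return "".join(exons)

def non_canonical_introns(genomic: str) -> list:
    """Introns that do not follow the common gt...ag boundary rule."""
    _, introns = split_exons_introns(genomic)
    return [i for i in introns
            if not (i.startswith("gt") and i.endswith("ag"))]

# Hypothetical toy sequence with two introns, for illustration only.
TOY_GENOMIC = "ATGTGGgtaagtttcag AGGCTAgtatgcagACA"

exons, introns = split_exons_introns(TOY_GENOMIC)
print(exons, introns, spliced_sequence(TOY_GENOMIC))
```

Applied to a complete case-annotated genomic sequence, the spliced output could then be compared with the cDNA, and any intron reported by the boundary check would merit inspection for transcription errors.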

(IX)—β-amyrin synthase promoter (SEQ ID NO: 37) ttcaaataaaaattacctgatctgacatgatcactggctacgccgagatt ctacaaatatttctataagtagtttgtggattccaatatatataoggatt ccgtaaagctctcttaccgatggtatgactttagtagtaaoaaaatcata ggcttcgagtgaagattggctaocaactgtaatgtaagattgttgtccaa gataagatactcaagttacagatgcactactctaatactaagagttattg atctatattacggctcccgtaccgtagagatattgattctacgttcacct tcttaaaaggagattcttgtacaatcaaaacaaatgggtctagctacctt ggtcaatatgtatttctatcggtatttagttataaaggagaggaatacag aataatttttttaactccatagtacctctattgctttcagtataaagagt ttgatgcacggttctctgtactaataaatgttctattgttgattgattct taaccgcatcctatgcaattttaacctcaaaaaagtttcacggtacaocg acttgccttactagccctactgttttcttgagaaggatgttcaaactttg ggcttttgcatctaaaataagacacacatcatttttggtttattattcaa caatgtgtgggaaaagcatacaacaatcaactcgatataccaccttcgcg gagggcctoctctttaaatgtctgggagtactacacatatgtaaagatga tgcccacttacaaagaacgaggacaccacttaaaccgggtgtacaaagta ctacacatatgtaaagacgaggccatagaacaagcaagagcaccaagata tttagatccactaaaatgcaaccacctogatgtccataaaaaatgatggt gacgcacaacactcaacaaatatcgataaaaatgatagtgtootagttgc acatcttotaacatgttggtgtctattatgcacaagtgggcatggaagca agtaaatattgtgtactatagctactggtgactcgagtgtatctccaaga ctcgatagcaaacccgaagcctcttcagcttgtccacatatcattgtgga atgttcactacgactcgccacgccaagcataacctggataagccacgtgg gatatgagatttcccgcagcttccctctgagtgaggaggcagaactatac gcctcaacacgacgagccaccccctaaggctagtcatagtgggagtaact tgggtagtaacatattcctacatatattgcgaactaagcatttagatgac atgacatgcaattaaatgatgagagagagtcttatgataactagctatgt taccataacatcacacatttctaaaaaaataaatctatattataataaat aaggttttgcatgataccacatctatgttattttgcactatgaagatagt aacttagactagtaacatatacatgttactactctaagttactccccaca atgaccagcctaacaccttttgtactgttttgcacatttgcagtttactt tttcttaggtgaagagaaaacacaagacataattttaatatttcaacttc attacgtgctggtgcaaataatttttacggtgcaattttcgacatgattt attgtatatttacagaaatttatgctccaaatttgtttggtaccttcagt attagtttctggacattgtacatattatgttgccgtataagctgagctag aaggatcattagtgtaattccatatatatctaaatgtacctgtggaatca catttgaggaagttccaatgatgccctttttgccctgcacacgcatatat aagaaccctttgcccgcagcatagagctagtactagctagtatcccattg 
cttgttttcctcgacatacactgcccgttgttggtgcgcaccATG

(X) Putative transcription factor binding sites of the promoter region of SEQ ID NO: 37
1  TTCAAATAAAAATTACCTGATCTGACATGATCACTGGCTACGCCGAGATT
51  CTACAAATATTTCTATAAGTAGTTTGTGGATTCCAATATATATACGGATT
(101) +ANNWAAAGNNN(P$DOF1_01(0.959))
(101) +NWNWAAAGNGN(P$PBF_01(0.915))
101  CCGTAAAGCTCTCTTACCGATGGTATGACTTTAGTAGTAACAAAATCATA
(126) −NCNCTTTWNWN(P$PBF_01(0.939))
(126) −NNNCTTTWNNT(P$DOF1_01(0.966))
(168) +ACCWACCNN(P$P_01(0.856))
(174) +YAACSGMC(P$GAMYB_01(0.893))
151  GGCTTCGAGTGAAGATTGGCTACCAACTGTAATGTAAGATTGTTGTCCAA
(188) −GKCSGTTR(P$GAMYB_01(0.905))
201  GATAAGATACTCAAGTTACAGATGCACTACTCTAATACTAAGAGTTATTG
251  ATCTATATTACGGCTCCCGTACCGTAGAGATATTGATTCTACGTTCACCT
(302) +NWNWAAAGNGN(P$PBF_01(0.932))
(302) +ANNWAAAGNNN(P$DOF1_01(0.919))
(342) +ACCWACCNN(P$P_01(0.906))
301  TCTTAAAAGGAGATTCTTGTACAATCAAAACAAATGGGTCTAGCTACCTT
(380) +NWNWAAGNGN(P$PBF_01(0.915))
(380) +ANNWAAAGNNN(P$DOF1_01(0.931))
(397) +NCAATTATTNNN(P$ATHB1_01(0.864))
351  GGTCAATATGTATTTCTATCGGTATTTAGTTATAAAGGAGAGGAATATACAG
(398) −NNAATAATTGNNN(P$ATHB1_01(0.948))
(440) +NWNWAAAGNGN(P$PBF_01(0.936))
(440) +ANNWAAAGNN(P$DOF1_01(0.941))
401  AATAATTTTTTTAACTCCATAGTACCTCTATTGCTTTCAGTATAAAGAGT
(407) −NWWWTTAACNAYWM(P$SBF1_01(0.899))
(431) −NNNCTTTWNNT(P$DOF1_01(0.910))
(431) −NCNCTTTWNWN(P$PBF_01(0.906))
451  TTGATGCACGGTTCTCTGTACTAATAAATGTTCTATTGTTGATTGATTCT
(496) −NWWWTTAACNAYWM(P$SBF1_01(0.882))
(500) −NTAACSGTTTTNN(P$MYBPH3_01(0.882))
(501) +YAACSGMC(P$GAMYB_01(0.899))
(528) +ANNWAAGNNN(P$DOF1_01(0.931))
(528) +NWNWAAAGNGN(P$PBF_01(0.929))
501  TAACCGCATCCTATGCAATTTTAACCTCAAAAAAGTTTCACGGTACACCG
(517) −NWWWTTAACNAYWN(P$SBF1_01(0.961))
551  ACTTGCCTTACTAGCCCTACTGTTTTCTTGAGAAGGATGTTCAAACTTTG
(593) −NNCTTTWNNT(P$DOF1_01(0.879))
(593) −NCNCTTTWNWN(P$PBF_01(0.860))
(600) −NCNCTTTWNWN(P$PBF_01(0.938))
(600) −NNNCTTTWNNT(P$DOF1_01(0.943))
(637) +NNNCAATTATTNNN(P$ATHB1_01(0.869))
601  GGCTTTTGCATCTAAAATAAGACACACATCATTTTTGGTTTATTATTCAA
(659) +ANNWAAAGNNN(P$DOF1_01(0.925))
(659) +NWNWAAAGNGN(P$PBF_01(0.922))
(671) +YAACSGMC(P$GAMYB_01(0.905))
651  CAATGTGTGGGAAAAGCATACAACAATCAACTCGATATACCACCTTCGCG
(739) +NWNWAAAGNGN(P$PBF_01(0.924))
(739) +ANNAAAGNNN(P$DOF1_01(0.965))
701  GAGGGCCTCCTCTTTAAATGTCTGGGAGTACTACACATATGTAAAGATGA
(709) −NCNCTTTWNWN(P$PBF_01(0.957))
(709) −NNNCTTTWNNT(P$DOF1_01(0.971))
(758) +NWNWAAAGNGN(P$PBF_01(0.901))
(758) +ANNWAAAGNNN(P$DOF1_01(0.877))
(791) +ANNWAAAGNNN(P$DOF1_01(0.871))
(791) +NWNWAAAGNGN(P$PBF_01(0.896))
751  TGCCCACTTACAAAGAACGAGGACACCACTTAAACCGGGTGTACAAAGTA
(809) +NWNWAAAGNGN(P$PBF_01(0.932))
(809) +ANNWAAAGNNN(P$DOF1_01(0.965))
801  CTACACATATGTAAAGACGAGGCCATAGAACAAGCAAGAGCACCAAGATA
(867) +ACCWACCNN(P$P_01(0.852))
(869) +YAACSGMC(P$GAMYB_01(0.947))
851  TTTAGATCCACTAAAATGCAACCACCTCGATGTCCATAAAAAATGATGGT
(907) +YAACSGMC(P$GAMYB_01(0.897))
901  GACGCACAACACTCAACAAATATCGATAAAAATGATAGTGTCCTAGTTGC
951  ACATCTTCTAACATGTTGGTGTCTATTATGCACAAGTGGGCATGGAAGCA
1001  AGTAAATATTGTGTACTATAGCTACTGGTGACTCGAGTGTATCTCCAAGA
1051  CTCGATAGCAAACCCGAAGCCTCTTCAGCTTGTCCACATATCATTGTGGA
(1141) +NNNSACGTGNCM(P$GBP_Q6(0.949))
1101  ATGTTCACTACGACTCGCCACGCCAAGCATAACCTGGATAAGCCACGTGG
(1141) −KGNCACGTSNNN(P$GBP_Q6(0.944))
1151  GATATGAGATTTCCCGCAGCTTCCCTCTGAGTGAGGAGGCAGAACTATAC
1201  GCCTCAACACGACGAGCCACCCCCTAAGGCTAGTCATAGTGGGAGTAACT
1251  TGGGTAGTAACATATTCCTACATATATTGCGAACTAAGCATTTAGATGAC
(1251) −NNGGTWGGT(P$P_01(0.855))
(1338) +YAACSGMC(P$GAMYB_01(0.857))
1301  ATGACATGCAATTAAATGATGAGAGAGAGTCTTATGATAACTAGCTATGT
1351  TACCATAACATCACACATTTCTAAAAAAATAAATCTATATTATAATAAAT
1401  AAGGTTTTGCATGATACCACATCTATGTTATTTTGCACTATGAAGATAGT
1451  AACTTAGACTAGTAACATATACATGTTACTACTCTAAGTTACTCCCCACA
1501  ATGACCAGCCTAACACCTTTTGTACTGTTTTGCACATTTGCAGTTTACTT
(1514) −NCNCTTTWNWN(P$PBF_01(0.927))
(1514) −NNNCTTTWNNT(P$DOF1_01(0.933))
(1545) −NCNCTTTWNWN(P$PBF_01(0.945))
(1545) −NNNCTTTWNNT(P$DOF1_01(0.940))
(1579) +KWRTNGTTAAWWWN(P$SBF1_01(0.854))
1551  TTTCTTAGGTGAAGAGAAAAACACAAGACATAATTTTAATATTTCAACTTC
(1581) −NWWWTTAACNAYWM(P$SBF1_01(0.878))
(1613) +NNNCAATTATTNNN(P$ATHB1_01(0.860))
1601  ATTACGTGCTGGTGCAAATAATTTTTACGGTGCAATTTTCGACATGATTT
(1614) −NNNAATAATTGNNN(P$ATHB1_01(0.945))
1651  ATTGTATATTTACAGAAATTTATGCTCCAAATTTGTTTGGTACCTTCAGT
1701  ATTAGTTTCTGGACATTGTACATATTATGTTGCCGTATAAGCTGAGCTAG
1751  AAGGATCATTAGTGTAATTCCATATATATCTAAATGTACCTGTGGAATCA
1801  CATTTGAGGAAGTTCCAATGATGCCCTTTTTGCCCTGCACACGCATATAT
(1823) −NNNCTTTWNNT(P$DOF1_01(0.913))
(1823) −NCNCTTTWNWN(P$PBF_01(0.943))
1851  AAGAACCCTTTGCCCGCAGCATAGAGCTAGTACTAGCTAGTATCCCATTG
(1855) −NCNCTTTWNWN(P$PBF_01(0.880))
(1855) −NNNCTTTWNNT(P$DOF1_01(0.859))
1901  CTTGTTTTCCTCGCATACACTGCCCGTTGTTGGTGCGCACCATG
(1922) −GKCSGTTR(P$GAMYB_01(0.961))
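
Each candidate site in the listing above is given as a start coordinate, a strand, an IUPAC consensus and a matrix name. As a minimal sketch only, and not the tool that produced the listing, the following Python fragment scans a sequence on both strands for exact matches to an IUPAC consensus. Exact matching is stricter than matrix-based scoring, so it is not expected to reproduce every listed site. The names `scan` and `reverse_complement`, the `offset` parameter and the `WAAAG` core used in the example are illustrative assumptions.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles ambiguity codes (R<->Y, K<->M, B<->V, D<->H).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a DNA string, ambiguity codes included."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def scan(sequence, consensus, offset=1):
    """Return sorted (position, strand) pairs for exact consensus matches.

    `offset` is the coordinate of the first base of `sequence` (hypothetical
    parameter). Minus-strand hits are reported at their leftmost coordinate
    on the strand supplied.
    """
    sequence = sequence.upper()
    hits = []
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        # A lookahead lets overlapping matches be reported as well.
        pattern = "(?=" + "".join(IUPAC[base] for base in motif.upper()) + ")"
        for match in re.finditer(pattern, sequence):
            hits.append((match.start() + offset, strand))
    return sorted(hits)


# The 50-base line of the listing that starts at position 101.
LINE_101 = "CCGTAAAGCTCTCTTACCGATGGTATGACTTTAGTAGTAACAAAATCATA"
print(scan(LINE_101, "WAAAG", offset=101))  # [(104, '+'), (129, '-')]
```

A palindromic consensus would be reported once per strand at the same coordinate, which may or may not be wanted depending on the analysis.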

TABLE 1 Specific primers for amplifying A. strigosa β-amyrin synthase
No.  Amino acid sequence  Forward primer  Reverse primer
I.  KSNNGFL  AAGTCGAACAATGGCTTCCTT  AAGGAAGCCATTGTTCGACTT
II.  VWEYDADAGT  GTGGGAGTACGACGCCGATGCCGGCACG  CGTGCCGGCATCGGCGTCGTACTCCCAC
III.  RVRAEFTK  AGGGTGCGTGCGGAATTCACAAAG  CTTTGTGAATTCCGCACGCACCCT
IV.  PANIPTEA  CCGGCGAATATTCCGACAGAAGCC  GGCTTCTGTCGGAATATTCGCCGG
V.  KSTEVTHETIYES  AAGAGTACACAGGTCACTCACGAGACTATCTACGAATCA  TGATTCGTAGATAGTCTCGTGAGTGACCTCTGTACTCTT
VI.  IMPINIFS  ATTATGCCTATCAATATATTCTCT  AGAGAATATATTGATAGGCATAAT
VII.  RSLDTFLS  AGAGAATATATTGATAGGCATAAT  ATTATGCCTATCAATATATTCTCT
VIII.  NQQNEDGGWGKMVLGP  AATCAACAGAATGAAGATGGTGGTTGGGGAAAAATGGTTCTTGGCCCA  TGGGCCAAGAACCATTTTTCCCCAACCACCATCTTCATTCTGTTGATT
IX.  GSCMNYATLM  GGATCGTGTATGAATTATGCAACCTTAATG  CATTAAGGTTGCATAATTCATACACGATCC
X.  KRNGDHKDALEK  AAGCGAAATGGTGATCATAAGGATGCATTGGAAAAA  TTTTTCCAATGCATCCTTATGATCACCATTTCGCTT
XI.  SHGTATAIPQ  TCTCATGGAACTGCAACTGCAATACCACAG  CTGTGGTATTGCAGTTGCAGTTCCATGAGA
XII.  IIPELWLVPH  ATTATACCTGAATTGTGGTTGGTTCCACAT  ATGTGGAACCAACCACAATTCAGGTATAAT
XIII.  RFWCFTRLIYMSMA  CGTTTTTGGTGTTTTACCCGGTTGATATACATGTCAATGGCA  TGCCATTGACATGTATATCAACCGGGTAAAACACCAAAAACG
XIV.  ALRQDLYSIPYCN  GGTCTGCGACAAGACCTCTATAGTATACCTTACTGCAAC  GTTGCAGTAAGGTATACTATAGAGGTCTTGTCGCAGAGC
XV.  WDKARDYC  TGGGACAAGGCGCGTGATTATTGT  ACAATAATCACGCGCCTTGTCCCA
XVI.  RSRAQDLISGCLTK  CGCTCACGGGCACAAGATCTTATATCTGGTTGCCTAACGAAA  TTTCGTTAGGCAACCAGATATAAGATCTTGTGCCCGTGAGCG
XVII.  ILNWWPANKLR  ATTTTGAATTGGTGGCCAGCAAACAAGCTAAGA  TCTTAGCTTGTTTGCTGGCCACCAATTCAAAAT
XVIII.  DRALTNLMEH  GATAGAGCTTTAACTAACCTCATGGAGCAT  ATGCTCCATGAGGTTAGTTAAAGCTCTATC
XIX.  STKYVGICPI  TCAACCAAATATGTGGGCATTTGCCCTATT  AATAGGGCAAATGCCCACATATTTGGTTGA
XX.  ICCWVENPNSPE  ATTTGTTGTTGGGTAGAAAACCCAAATTCGCCTGAA  TTCAGGCGAATTTGGGTTTTCTACCCAACAACAAAT
XXI.  AQVYDGCHS  GCACAGGTATATGATGGATGTCATAGC  GCTATGACATCCATCATATACCTGTGC
XXII.  ELAFIIHAYC  GAACTAGCGTTCATAATTCATGCCTATTGT  ACAATAGGCATGAATTATGAACGCTAGTTC
XXIII.  STDLTSEFI  TCCACGGATCTTACTAGCGAGTTTATC  GATAAACTCGCTAGTAAGATCCGTGGA
XXIV.  LFNHPNHESY  CTTTTCAACCACCCAAATCATGAAAGCTAT  ATAGCTTTCATGATTTGGGTGGTTGAAAAG
XXV.  LSSVDNGWS  CTTTCAAGTGTAGATAATGGTTGGTCT  AGACCAACCATTATCTACACTTGAAAG
XXVI.  KISADLVGDPIKQD  AAGATATCCGCTGACCTTGTTGGCGATCCAATAAAACAAGAC  GTCTTGTTTTATTGGATCGCCAACAAGGTCAGCGGATATCTT
XXVII.  IDCILSFMNTD  ATTGATTGCATCCTATCTTTCATGAATACAGAT  ATCTGTATTCATGAAAGATAGGATGCAATCAAT
XXVIII.  TFSTYECKRTFA  ACATTTTCTACCTACGAATGCAAACGGACATTCGCT  AGCGAATGTCCGTTTGCATTCGTAGGTAGAAAATGT
XXIX.  NPSESFRN  AACCCTTCTGAGAGTTTTCGGAAC  GTTCCGAAAACTCTCAGAAGGGTT
XXX.  VVDALIL  GTGGTTGATGCTCTCATATTA  TAATATGAGAGCATCAACCAC
XXXI.  ETNPRYRRA  GAGACGAATCCACGATATCGAAGAGCA  TGCTCTTCGATATCGTGGATTCGTCTC
XXXII.  DKCIEEAVVF  GATAAATGCATTGAAGAAGCTGTTGTATTT  AAATACAACAGCTTCTTCAATGCATTTATC
XXXIII.  CMFAVRALVAT  TGCATGTTTGCAGTAAGGGCGTTGGTTGCTACA  TGTAGCAACCAACGCCCTTACTGCAAACATGCA
XXXIV.  DNCASIRKSCK  GACAATTGTGCTTCTATCAGGAAATCATGCAAA  TTTGCATGATTTCCTGATAGAAGCACAATTGTC
XXXV.  VLSKQQTT  GTCTTATCAAAGCAACAAACAACA  TGTTGTTTGTTGCTTTGATAAGAC
XXXVI.  DYLSSDNGEYIDS  GACTATCTTTCTAGTGACAATGGGGAATATATTGATAGC  GCTATCAATATATTCCCCATTGTCACTAGAAAGATAGTC
XXXVII.  GRPNAVTTS  GGTAGGCCTAATGCTGTGACCACCTCA  TGAGGTGGTCACAGCATTAGGCCTACC
XXXVIII.  YAGQVERDPV  TATGCTGGACAGGTTGAACGTGACCCAGTA  TACTGGGTCACGTTCAACCTGTCCAGCATA
XXXIX.  YNAARQLMNMQLET  TATAATGCTGCAAGACAGCTAATGAATATGCAGCTAGAAACA  TGTTTCTAGCTGCATATTCATTAGCTGTCTTGCAGCATTATA
XL.  CFNSSLNFNYANY  TGCTTCAACTCCTCCTTGAACTTCAACTACGCCAACTAC  GTAGTTGGCGTAGTTGAAGTTCAAGGAGGAGTTGAAGCA
XLI.  IMALGELRRRLLAIKS  ATTATGGCTCTTGGGGAACTTCGCCGTCGACTTCTTGCGATTAAGAGCTGA  TCAGCTCTTAATCGCAAGAAGTCGACGGCGAAGTTCCCCAAGAGCCATAAT

In Table 1, amino acid sequences I–XLI correspond to SEQ ID NOS: 50–90; forward primers I–XLI correspond to SEQ ID NOS: 91–131; and reverse primers I–XLI correspond to SEQ ID NOS: 132–172.
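
Each row of Table 1 pairs an amino acid sequence with a forward primer and a reverse primer. As a hedged illustration, the following Python sketch translates a forward primer with the standard genetic code and computes its reverse complement, so that a row can be checked for internal consistency. The helper names `translate` and `reverse_complement` are hypothetical and form no part of the disclosed method; primer I is used as the example.

```python
# Standard genetic code, indexed by base order T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate a DNA string in frame 1; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna):
    """Reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


# Row I of Table 1.
forward = "AAGTCGAACAATGGCTTCCTT"
reverse = "AAGGAAGCCATTGTTCGACTT"
print(translate(forward))                      # KSNNGFL
print(reverse_complement(forward) == reverse)  # True
```

A row for which either check fails may be worth comparing against the corresponding entries in the sequence listing.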

TABLE 2
Number  Forward primers  Reverse primers
1  ASWSKAMMAATRRYTTYVTW  WABRAARYYATTKKTMSWST
2  WSRRAKAACMGGTWYCAG  CTGRWACCKGTTMTYYSW
3  KYTMTTYWTYMTKCCDMYC  GRKHGGMAKRAWRAAKARM
4  KWCGCTACATWTACWRTC  GAYWGTAWATGTAGCGWM
5  TGGTGRTSWWAASRATGCMT  AKGCATYSTTWWSAYCACCA
6  KGATCTHAYTVRYGARWTTVKH  DMBAAWYTCRYBARTDAGATCM
7  AAARSYATNYATCGYCACA  TGTGRCGATRNATRSYTTT
8  GARTGCACWTCATCDGYR  YRCHGATGAWGTGCAYTC
9  GGAAARACMTACDACAAYT  ARTTGTHGTAKGTYTTTCC
10  YTTYCCCCAACAGGAAMWM  KWKTTCCTGTTGGGGRAAR
11  TACCGSAATMTATACCCRWT  AWYGGGTATAKATTSCGGTA
In Table 2, forward primers 1–11 and reverse primers 1–11 correspond to SEQ ID NOS: 173–183 and 184–194 respectively.
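
The primers of Table 2 are written with IUPAC ambiguity codes, so each entry stands for a pool of distinct oligonucleotides. As a minimal sketch under that reading, the following Python fragment counts and enumerates the sequences represented by one degenerate primer. The function names `degeneracy` and `expand` are hypothetical, and forward primer 4 serves as the example.

```python
from itertools import product
from math import prod

# Bases represented by each IUPAC nucleotide code.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC_BASES[base]) for base in primer.upper())


def expand(primer):
    """List every unambiguous sequence represented by a degenerate primer."""
    choices = [IUPAC_BASES[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


forward_4 = "KWCGCTACATWTACWRTC"
print(degeneracy(forward_4))   # 32
print(len(expand(forward_4)))  # 32
```

Enumeration grows multiplicatively with each ambiguous position, so `expand` is practical only for primers of modest degeneracy.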

TABLE 3 Primers used to cover the complete genomic sequence
XLII. AMYstaF5: 5′-CCATGTGGAGGCTAACAATAGGTGAGGG-3′
XLIII. AMYstaF6: 5′-TTTCCTCGCATACACTGCCCGTTGTT-3′
XLIV. AMY01F: 5′-TATTCCGACAGAAGCCAA-3′
XLV. AMY01R: 5′-TCTTTGTGAATTCCGCAC-3′
XLVI. AMY02F: 5′-GTTGGGGAAAAATGGTTC-3′
XLVII. AMY03F: 5′-TTTTCTTCCGATTCACCC-3′
XLVIII. AMY04F: 5′-ATTGTGGAGCCAATTTTG-3′
XLIX. AMY05F: 5′-TGGATGTCATAGCTGGGA-3′
L. AMY06F: 5′-TATCCGCTGACCTTGTTG-3′
LI. AMY07F: 5′-CGAATCCACGATATCGAA-3′
LII. AMY08F: 5′-GTGGATGGGGTGAAGACT-3′
LIII. AMY09F: 5′-TATGGCTCTTGGGGAACT-3′
LIV. AMY10F: 5′-GTATGGATTTCGTACCGTAAAT-3′
LV. AMY11F: 5′-CCTGCGACAAGACCTCTATA-3′
LVI. AMY12F: 5′-CTCAACCCTTCTGAGAGTTTT-3′
LVII. AMY13F: 5′-CAGCTTGGTCTACAACTGTTAC-3′
LVIII. AMY014F: 5′-CAGAGGTAGTTAGAAAAATTATTGGACT-3′
LIX. AMY015F: 5′-GGCGGAGGATGGAATGAAGGCA-3′
LX. AMY016F: 5′-CAGTGGGATTCTCTTCATTATGC-3′
LXI. AMY017R: 5′-AGCCTTTTGATCTGTGGCGATA-3′
LXII. AMY018F: 5′-GTGGCTCATCACATTGATCACA-3′
LXIII. AMY019F: 5′-TGGGCAATGTTGGCTTTAATTT-3′
LXIV. AMYendR3: 5′-GCCTAGACATCCATGTTTGTTTCCATATCA-3′
LXV. AMYendR5: 5′-TATTTCTCAAAGAAAATAACGCATGAATGCTC-3′

In Table 3, primer XLII corresponds to SEQ ID NO: 5; primers XLIII–LXIV correspond to SEQ ID NOS: 195–216; and primer LXV corresponds to SEQ ID NO: 6.

Blast Analysis

Partial cDNA Sequence (clone ort1s.pk001.c14)

Blastx search at the National Center for Biotechnology Information (NCBI) revealed homology with:

The cycloartenol synthase (AB025968) from Glycyrrhiza glabra. Score=109 (bits), E−Value=5e−24, Identities=45/86 (52%).

The cycloartenol synthase (AB033334) from Luffa cylindrica. Score=109 (bits), E−Value=4e−24, Identities=44/86 (51%).

The cycloartenol synthase (AF169966) from Oryza sativa. Score=107 (bits), E−Value=2e−23, Identities=48/83 (57%).

The oxidosqualene cyclase (AB025353) from Allium macrostemon. Score=104 (bits), E−Value=1e−22, Identities=47/83 (56%).

The cycloartenol synthase (AC005171) from Arabidopsis thaliana. Score=104 (bits), E−Value=2e−22, Identities=44/86 (51%).

The cycloartenol synthase (AB009029) from Panax ginseng. Score=104 (bits), E−Value=1e−22, Identities=43/86 (50%).

The cycloartenol synthase (D89619) from Pisum sativum. Score=102 (bits), E−Value=6e−22, Identities=43/86 (50%).

The beta-amyrin synthase (AB014057) from Panax ginseng. Score=98.7 (bits), E−Value=9e−21, Identities=43/84 (51%).

The search used the non-redundant GenBank CDS translations, PDB, SwissProt, SPupdate and PIR databases with a BLOSUM62 matrix and gap penalties of Existence 11, Extension 1.

Further Blastx searches at the National Center for Biotechnology Information (NCBI), using the non-redundant GenBank CDS translations, PDB, SwissProt, SPupdate and PIR databases with a BLOSUM62 matrix, showed homology to the beta-amyrin synthase cDNA from Panax ginseng and to cycloartenol synthase genes from other plant species. The results are shown below.

1. Cycloartenol synthase (D89619) from Pisum sativum. Score=(131), E−Value=4e−30, Identities=58/127 (45%).

2. Cycloartenol synthase (AC005171) from Arabidopsis thaliana. Score=(131), E−Value=4e−30, Identities=57/127 (44%).

3. Cycloartenol synthase (AF169966) from Oryza sativa. Score=(130), E−Value=5e−30, Identities=63/125 (50%).

4. Cycloartenol synthase (AB025968) from Glycyrrhiza glabra. Score=(128), E−Value=2e−29, Identities=57/127 (44%).

5. Oxidosqualene cyclase (AB025353) from Allium macrostemon. Score=(126), E−Value=9e−29, Identities=57/127 (44%).

6. Cycloartenol synthase (AB009029) from Panax ginseng. Score=(125), E−Value=1e−28, Identities=54/127 (42%).

7. Beta-amyrin synthase (AB014057) from Panax ginseng. Score=(118), E−Value=3e−26, Identities=57/127 (44%). 
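
The identity percentages quoted in the two hit lists above are consistent with the ratio of identical to aligned residues truncated to a whole number rather than rounded (48/83, for example, is reported as 57%). A minimal sketch of that arithmetic follows; `identity_percent` is a hypothetical helper and is not part of BLAST.

```python
def identity_percent(identical, aligned):
    """Percentage identity truncated to a whole number (assumed convention)."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    # Integer division truncates, matching the percentages quoted in the text.
    return 100 * identical // aligned


# (identical, aligned) pairs taken from the hit lists above.
for identical, aligned in [(45, 86), (48, 83), (58, 127), (63, 125)]:
    print(f"{identical}/{aligned} -> {identity_percent(identical, aligned)}%")
```

With rounding instead of truncation, 48/83 and 58/127 would read 58% and 46%, which differ from the values given in the lists.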

1. An isolated nucleic acid which comprises a nucleotide sequence selected from the group consisting of: a) a nucleotide sequence encoding the polypeptide of SEQ ID NO: 15; b) the nucleotide sequence of SEQ ID NO: 14; c) the nucleotide sequence of SEQ ID NO: 36; and d) a nucleotide sequence encoding a functional β-amyrin synthase having at least 95% homology to SEQ ID NO: 15.
2. The isolated nucleic acid as claimed in claim 1, wherein the nucleotide sequence comprises SEQ ID NO: 14 or SEQ ID NO: 36.
3. The isolated nucleic acid as claimed in claim 1, wherein the nucleotide sequence encodes the polypeptide of SEQ ID NO: 15.
4. An isolated nucleic acid which comprises the complement of the nucleotide sequence of SEQ ID NO: 14 or 36.
5. A recombinant vector which comprises the nucleic acid of claim 1.
6. The recombinant vector as claimed in claim 5, wherein the nucleic acid is operably linked to a promoter.
7. The recombinant vector as claimed in claim 5 which is a plant transformation vector.
8. A method of producing a transformed host cell comprising introducing the vector of claim 5 into a host cell.
9. A host cell transformed with the nucleic acid of claim 1.
10. The host cell as claimed in claim 9, which is Saccharomyces cerevisiae.
11. The host cell as claimed in claim 9, which is a plant cell.
12. A method for producing a transgenic plant, which method comprises the steps of: a) introducing the vector of claim 5 into a plant cell to produce a transformed plant cell; and b) regenerating a plant from the transformed plant cell.
13. A transgenic plant which is produced by the method of claim 12.
14. The transgenic plant as claimed in claim 13, which is selected from the group consisting of barley, phaseolus, pea, sugar beet, maize, oat, solanum, allium, cucurbitaceae, yam, rice, rye, sorghum, soyabean, spruce, strawberry, sugarcane, sunflower, tomato, and wheat.
15. A transgenic seed or progeny from the transgenic plant as claimed in claim 14.
16. A method of making a polypeptide having β-amyrin synthase activity, which method comprises the step of causing or allowing expression of the polypeptide from the nucleic acid of claim 1 in a suitable host cell; and isolating the polypeptide from the host cell.
17. A method for altering triterpenoid synthesis in a plant, which method comprises introducing the nucleic acid as claimed in claim 1 into the plant.
18. A method for altering resistance to a fungal pathogen in a plant, which method comprises introducing the nucleic acid as claimed in claim 1 into the plant.
19. The method as claimed in claim 18, wherein the fungal pathogen is selected from the group consisting of Gaeumannomyces var avenae, Gaeumannomyces var tritici, Fusarium culmorum, Fusarium avenaceum, Stagonospora nodorum, and Stagonospora avenae.
20. A method as claimed in claim 17 wherein the triterpenoid is an oleanane-type triterpene saponin, and the method further comprises isolating said triterpenoid from the plant.
21. A method for reducing the level of at least one triterpenoid in a plant, which method comprises introducing the nucleic acid of claim 4 into the plant.
22. A recombinant vector which comprises the nucleic acid of claim 4.
23. A host cell transformed with the nucleic acid of claim 4.